NOVEL THERMOPHILIC POLYMERASE III HOLOENZYME

FIELD OF THE INVENTION 

The present invention relates to gene and amino acid sequences encoding DNA
polymerase III holoenzyme subunits and structural genes from thermophilic organisms.
In particular, the present invention provides DNA polymerase III holoenzyme subunits
of T. thermophilus. The present invention also provides antibodies and other reagents
useful to identify DNA polymerase III molecules.

BACKGROUND

Bacterial cells contain three types of DNA polymerases termed polymerase I, II
and III. DNA polymerase III (pol III) is responsible for the replication of the majority
of the chromosome. Pol III is referred to as a replicative polymerase; replicative
polymerases are rapid and highly processive enzymes. Pol I and II are referred to as
non-replicative polymerases although both enzymes appear to have roles in replication.
DNA polymerase I is the most abundant polymerase and is responsible for some types
of DNA repair, including a repair-like reaction that permits the joining of Okazaki
fragments during DNA replication. Pol I is essential for the repair of DNA damage
induced by UV irradiation and radiomimetic drugs. Pol II is thought to play a role in
repairing DNA damage which induces the SOS response, and in mutants which lack
both pol I and III, pol II repairs UV-induced lesions. Pol I and II are monomeric
polymerases while pol III comprises a multisubunit complex.

In E. coli, pol III comprises the catalytic core of the E. coli replicase. In E.
coli, there are approximately 400 copies of DNA polymerase I per cell, but only 10-20
copies of pol III (Kornberg and Baker, DNA Replication, 2d ed., W.H. Freeman &
Company, [1992], pp. 167; and Wu et al., J. Biol. Chem., 259:12117-12122 [1984]).
The low abundance of pol III and its relatively feeble activity on gapped DNA
templates typically used as general replication assays delayed its discovery until the
availability of mutants defective in DNA polymerase I (Kornberg and Gefter, J. Biol.
Chem., 247:5369-5375 [1972]).

The catalytic subunit of pol III is distinguished as a component of the E. coli major
replicative complex, apparently not by its intrinsic catalytic activity, but by its ability
to interact with other replication proteins at the fork. These interactions confer upon
the enzyme enormous processivity: once the DNA polymerase III holoenzyme
associates with primed DNA, it does not dissociate for over 40 minutes, the time
required for the synthesis of the entire 4 Mb E. coli chromosome (McHenry, Ann.
Rev. Biochem., 57:519-550 [1988]). Studies in coupled rolling circle models of the
replication fork suggest the enzyme can synthesize DNA 150 kb or longer without
dissociation in vitro (Mok and Marians, J. Biol. Chem., 262:16644-16654 [1987]; Wu
et al., J. Biol. Chem., 267:4030-4044 [1992]). The essential interaction required for
this high processivity is an interaction between the α catalytic subunit and a dimer of
β, a sliding clamp processivity factor that encircles the DNA template like a bracelet,
permitting it to rapidly slide along with the associated polymerase, but preventing it
from falling off (LaDuca et al., J. Biol. Chem., 261:7550-7557 [1986]; Kong et al.,
Cell 69:425-437 [1992]). The β-α association apparently retains the polymerase on
the template during transient thermal fluctuations when it might otherwise dissociate.
The β2 bracelet cannot spontaneously associate with high molecular weight
DNA; it requires a multiprotein DnaX-complex to open and close it around DNA using
the energy of ATP hydrolysis (Wickner, Proc. Natl. Acad. Sci. USA 73:3511-3515
[1976]; Naktinis et al., J. Biol. Chem., 270:13358-13365 [1995]; and Dallmann et
al., J. Biol. Chem., 270:29555-29562 [1995]). In E. coli, the dnaX gene encodes two
proteins, τ and γ. γ is generated by a programmed ribosomal frameshifting
mechanism five-sevenths of the way through dnaX mRNA, placing the ribosome in a
-1 reading frame where it immediately encounters a stop codon (Flower and McHenry,
Proc. Natl. Acad. Sci. USA 87:3713-3717 [1990]; Blinkowa and Walker, Nucl. Acids
Res., 18:1725-1729 [1990]; and Tsuchihashi and Kornberg, Proc. Natl. Acad. Sci.
USA 87:2516-2520 [1990]). In E. coli, the DnaX-complex has the stoichiometry
γ2τ2δ1δ'1χ1ψ1 (Dallmann and McHenry, J. Biol. Chem., 270:29563-29569 [1995]). The
τ protein contains an additional carboxyl-terminal domain that interacts tightly with the
polymerase, holding two polymerases together in one complex that can coordinately
replicate the leading and lagging strand of the replication fork simultaneously
(McHenry, J. Biol. Chem., 257:2657-2663 [1982]; Studwell and O'Donnell, J. Biol.
Chem., 266:19833-19841 [1991]; McHenry, Ann. Rev. Biochem. 57:519-550 [1988]).
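
The programmed -1 frameshift described for dnaX can be illustrated with a short sketch. The mRNA string used here is a hypothetical toy sequence, not the dnaX sequence of E. coli or any sequence of the present disclosure, and the function names are assumptions introduced only for illustration. Reading in the original frame gives the longer, τ-like product; slipping back one nucleotide puts a stop codon directly in front of the ribosome and gives the shorter, γ-like product.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}

# Hypothetical toy mRNA (assumption, not a real dnaX sequence).
# Frame 0: AUG GCU GCA AAA AAU GAC GAA CCG UAA
TOY_MRNA = "AUGGCUGCAAAAAAUGACGAACCGUAA"


def read_codons(mrna, start=0):
    """Return the codons read from `start` until a stop codon or the end."""
    codons = []
    for i in range(start, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon in STOP_CODONS:
            break
        codons.append(codon)
    return codons


def read_with_minus_one_shift(mrna, codons_before_shift):
    """Read `codons_before_shift` codons in frame 0, then slip back one base."""
    in_frame = read_codons(mrna)[:codons_before_shift]
    if len(in_frame) < codons_before_shift:
        raise ValueError("a stop codon occurs before the requested shift site")
    # Resume one nucleotide upstream of where frame 0 would have continued.
    resume_at = 3 * codons_before_shift - 1
    return in_frame + read_codons(mrna, resume_at)


full_length = read_codons(TOY_MRNA)               # unshifted, tau-like product
truncated = read_with_minus_one_shift(TOY_MRNA, 5)  # shifted, gamma-like product
```

In this toy case the shifted reading frame meets UGA as its first codon, so the truncated product ends at the shift site while the unshifted product continues to the normal stop codon.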

Pol IIIs are apparently conserved throughout mesophilic eubacteria. In addition
to E. coli and related proteobacteria, the enzyme has been purified from the firmicute
Bacillus subtilis (Low et al., J. Biol. Chem., 251:1311-1325 [1976]; Hammond and
Brown [1992]). With the proliferation of bacterial genomes sequenced, by inference
from DNA sequence, pol III exists in organisms as widely divergent as Caulobacter,
Mycobacteria, Mycoplasma, B. subtilis and Synechocystis. The existence of dnaX and
dnaN (structural gene for β) is also apparent in these organisms. These general
replication mechanisms are conserved even more broadly in biology. Although
eukaryotes do not contain polymerases homologous to pol III, eukaryotes contain
special polymerases devoted to chromosomal replication and β-like processivity factors
(PCNA) and DnaX-like ATPases (RFC, Activator I) that assemble these processivity
factors on DNA (Yoder and Burgers, J. Biol. Chem., 266:22689-22697 [1991]; Brush
and Stillman, Meth. Enzymol., 262:522-548 [1995]; Uhlmann et al., Proc. Natl. Acad.
Sci. USA 93:6521-6526 [1996]).

In spite of the apparent ubiquity of pol IIIs and their associated factors required
to function as a replicase, the identification of such enzymes remains to be
accomplished for many other organisms.

SUMMARY OF THE INVENTION

The present invention relates to gene and amino acid sequences encoding DNA
polymerase III holoenzyme subunits and structural genes from thermophilic organisms.
In particular, the present invention provides DNA polymerase III holoenzyme subunits
of T. thermophilus. The present invention also provides antibodies and other reagents
useful to identify DNA polymerase III molecules.

In particular, the present invention provides a DNA polymerase III holoenzyme isolated
from a thermophilic organism. In particularly preferred embodiments, the thermophilic
organism is a thermophilic eubacterium. In other preferred embodiments, the
thermophilic organism is selected from a member of the genera Thermus, Thermotoga,
and Aquifex.

The present invention provides nucleotide sequences, including the nucleotide
sequence set forth in SEQ ID NO:7, as well as sequences comprising fragments of
SEQ ID NO:7, and sequences that are complementary to SEQ ID NO:7. In alternative
embodiments, the present invention provides the nucleotide sequence of SEQ ID NO:7,
wherein the nucleotide sequence further comprises 5' and 3' flanking sequences,
and/or intervening regions.

The present invention also provides recombinant DNA vectors, such as vectors
comprising SEQ ID NO:7. In an alternative embodiment, the present invention
provides host cells containing these recombinant vectors.

The present invention also provides a purified dnaX protein encoded by an
oligonucleotide comprising a nucleotide sequence substantially homologous to the
coding strand of the nucleotide sequence of SEQ ID NO:7. The present invention
provides full-length, as well as fragments of any size comprising the protein (i.e., the
entire amino acid sequence of the protein, as well as short peptides). In particularly
preferred embodiments, the dnaX protein is from Thermus thermophilus. In other
preferred embodiments, the dnaX protein comprises at least a portion of the amino
acid sequence set forth in SEQ ID NO:9.

The present invention also provides a fusion protein(s) comprising a portion of
the dnaX protein and a non-dnaX protein sequence. In some preferred embodiments,
the dnaX protein comprises SEQ ID NO:9.

The present invention also provides isolated amino acid sequences as set forth
in SEQ ID NO:2. In yet other embodiments, the present invention provides an
isolated nucleotide sequence encoding the amino acid sequence set forth in SEQ ID
NO:2. In additional embodiments, the present invention provides a purified tau protein
encoded by a polynucleotide sequence substantially homologous to the coding strand
of the nucleotide sequences encoding tau protein. In alternative embodiments, the
present invention provides nucleotide sequences encoding at least a portion of tau
protein, wherein the nucleotide sequence further comprises 5' and 3' flanking regions
and/or intervening regions. Thus, the present invention also encompasses fragments of
any size, comprising tau protein amino acid or nucleic acid sequences. In preferred
embodiments, the tau protein of the present invention is from Thermus thermophilus.

In an alternative embodiment, the present invention provides recombinant
vectors comprising at least a portion of the nucleotide sequence encoding tau protein.
In yet other embodiments, the present invention provides host cells containing at least
one recombinant DNA vector comprising at least a portion of tau protein. In further
embodiments, the present invention provides fusion protein(s) comprising at least a
portion of tau protein and a non-tau protein sequence.

The present invention also provides the amino acid sequence set forth in SEQ
ID NO:1. In one embodiment, the present invention provides an isolated nucleotide
sequence encoding the amino acid sequence set forth in SEQ ID NO:1. In other
embodiments, the present invention provides at least a portion of purified gamma
protein(s) encoded by a polynucleotide sequence substantially homologous to the
coding strand of the nucleotide sequence that encodes gamma protein. In yet other
embodiments, the present invention provides nucleotide sequences that further
comprise 5' and 3' flanking regions and/or intervening regions. In preferred
embodiments, the present invention provides gamma protein that is from Thermus
thermophilus.

The present invention also provides recombinant vectors comprising at least a
portion of a nucleotide sequence that encodes gamma protein. In yet other
embodiments, the present invention also provides host cells containing the recombinant
DNA vectors comprising at least a portion of nucleotide sequence encoding gamma
protein. In alternative embodiments, the present invention provides fusion protein(s)
comprising a portion of the gamma protein, and non-gamma protein sequence(s).

The present invention further provides the isolated nucleotide sequence set forth
in SEQ ID NO:196. In some embodiments, the nucleotide sequence further comprises
5' and 3' flanking regions and/or intervening regions. In yet other embodiments, the
present invention provides at least a portion of purified DnaE protein encoded by an
oligonucleotide comprising at least a portion of nucleotide sequence substantially
homologous to the coding strand of the nucleotide sequence SEQ ID NO:196. In
particularly preferred embodiments, the DnaE protein is from Thermus thermophilus.
In still further embodiments, the DnaE protein comprises the amino acid sequence set
forth in SEQ ID NO:197.

The present invention also provides recombinant vectors comprising at least a
portion of the nucleotide sequence that encodes the DnaE protein. In alternative
embodiments, the recombinant vectors are present within a host cell. In still other
embodiments, the present invention provides fusion protein comprising at least a
portion of the DnaE protein and a non-DnaE protein sequence. In preferred
embodiments of the fusion proteins, the DnaE protein comprises SEQ ID NO:197.

The present invention also provides an isolated nucleotide sequence as set forth
in SEQ ID NO:214. In some embodiments, the nucleotide sequence further comprises
5' and 3' flanking and/or intervening regions. In yet other embodiments, the present
invention provides at least a portion of purified DnaQ protein encoded by an
oligonucleotide comprising at least a portion of nucleotide sequence substantially
homologous to the coding strand of the nucleotide sequence SEQ ID NO:214. In
particularly preferred embodiments, the DnaQ protein is from Thermus thermophilus.
In still further embodiments, the DnaQ protein comprises the amino acid sequence set
forth in SEQ ID NO:215 and/or SEQ ID NO:216.

The present invention also provides recombinant vectors comprising at least a
portion of the nucleotide sequence that encodes the DnaQ protein. In alternative
embodiments, the recombinant vectors are present within a host cell. In still other
embodiments, the present invention provides fusion protein comprising at least a
portion of the DnaQ protein and a non-DnaQ protein sequence. In preferred
embodiments of the fusion proteins, the DnaQ protein comprises SEQ ID NO:215
and/or SEQ ID NO:216.

The present invention also provides the isolated nucleotide sequence set forth in
SEQ ID NO:221. In some embodiments, the nucleotide sequence further comprises
5' and 3' flanking and/or intervening regions. In yet other embodiments, the present
invention provides at least a portion of purified DnaA protein encoded by an
oligonucleotide comprising at least a portion of nucleotide sequence substantially
homologous to the coding strand of the nucleotide sequence SEQ ID NO:221. In
particularly preferred embodiments, the DnaA protein is from Thermus thermophilus.
In still further embodiments, the DnaA protein comprises the amino acid sequence set
forth in SEQ ID NO:222.

The present invention also provides recombinant vectors comprising at least a
portion of the nucleotide sequence that encodes the DnaA protein. In alternative
embodiments, the recombinant vectors are present within a host cell. In still other
embodiments, the present invention provides fusion protein comprising at least a
portion of the DnaA protein and a non-DnaA protein sequence. In preferred
embodiments of the fusion proteins, the DnaA protein comprises SEQ ID NO:222.

The present invention also provides the isolated nucleotide sequence set forth in
SEQ ID NO:230. In some embodiments, the nucleotide sequence further comprises
5' and 3' flanking and/or intervening regions. In yet other embodiments, the present
invention provides at least a portion of purified DnaN protein encoded by an
oligonucleotide comprising at least a portion of nucleotide sequence substantially
homologous to the coding strand of the nucleotide sequence SEQ ID NO:230. In
particularly preferred embodiments, the DnaN protein is from Thermus thermophilus.
In still further embodiments, the DnaN protein comprises the amino acid sequence set
forth in SEQ ID NO:231.

The present invention also provides recombinant vectors comprising at least a
portion of the nucleotide sequence that encodes the DnaN protein. In alternative
embodiments, the recombinant vectors are present within a host cell. In still other
embodiments, the present invention provides fusion protein comprising at least a
portion of the DnaN protein and a non-DnaN protein sequence. In preferred
embodiments of the fusion proteins, the DnaN protein comprises SEQ ID NO:231.

The present invention also provides methods for detecting DNA polymerase III
comprising: providing in any order, a sample suspected of containing DNA
polymerase III, an antibody capable of specifically binding to at least a portion of
the DNA polymerase III; mixing the sample and the antibody under conditions wherein
the antibody can bind to the DNA polymerase III; and detecting the binding. In
preferred embodiments of the methods, the sample comprises a thermophilic organism.
In alternative preferred embodiments, the thermophilic organism is a member of the
genus Thermus. The methods of the present invention encompass any method for
detection.

The present invention also provides methods for detection of polynucleotides
encoding at least a portion of DNA polymerase III holoenzyme (or DNA polymerase
III holoenzyme subunit) in a biological sample comprising the steps of: a) hybridizing
at least a portion of the polynucleotide sequence comprising at least fifteen nucleotides,
which hybridizes under stringent conditions to at least a portion of the polynucleotide
sequence selected from the group consisting of the DNA sequences set forth in SEQ
ID NOS:7, 20, 21, 22, 40, 41, 42, 58, 63, 64, 65, 66, 67, 77, 78, 79, 88, 91, 92, 97,
110, 111, 112, 113, 114, 115, 116, 134, 135, 136, 137, 156, 157, 173, 174, 174, 176,
177, 178, 179, 180, 190, 196, 214, 221, and 230, to nucleic acid material of a
biological sample, thereby forming a hybridization complex; and b) detecting the
hybridization complex, wherein the presence of the complex correlates with the
presence of a polynucleotide encoding at least a portion of DNA polymerase III
holoenzyme (or DNA polymerase III holoenzyme subunit) in the biological sample. In
one alternative embodiment of the methods, the nucleic acid material of the biological
sample is amplified by the polymerase chain reaction.
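
As a rough computational analogy to the hybridization step, the sketch below checks whether a probe of at least fifteen nucleotides is exactly complementary to some stretch of a target strand. Exact complementarity is only a crude stand-in for hybridization under stringent conditions, which in practice depends on temperature, salt and tolerated mismatches. The sequences and function names are hypothetical and are not taken from any SEQ ID NO of the present disclosure.

```python
# Translation table for DNA base pairing (A-T, C-G).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def probe_hybridizes(probe, target, min_length=15):
    """True if the probe is long enough and perfectly complementary
    to a contiguous stretch of the target strand (a simplification)."""
    if len(probe) < min_length:
        return False
    return reverse_complement(probe) in target.upper()


# Hypothetical target strand and a 15-nucleotide probe derived from it.
TARGET = "GGATCCATGCGTACCGGTTAACGCTAGCTTAAG"
PROBE = reverse_complement(TARGET[6:21])
```

A probe shorter than fifteen nucleotides is rejected regardless of its sequence, mirroring the minimum length recited in step a).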

The present invention also provides an antibody, wherein the antibody is
capable of specifically binding to at least one antigenic determinant on the protein
encoded by an amino acid sequence selected from the group comprising SEQ ID
NOS:1, 2, 8, 9, 26, 31, 34, 37, 59, 70, 75, 82, 85, 197, 198, 215, 216, 222, 223, 225,
226, and 231. The present invention encompasses polyclonal, as well as monoclonal
antibodies.

The present invention also provides methods for producing anti-DNA
polymerase III holoenzyme and anti-DNA polymerase III holoenzyme subunit
antibodies comprising exposing an animal having immunocompetent cells to an
immunogen comprising at least an antigenic portion of DNA polymerase III
holoenzyme (or holoenzyme subunit) protein, under conditions such that
immunocompetent cells produce antibodies directed against the portion of DNA
polymerase III protein holoenzyme or holoenzyme subunit. In one embodiment, the
method further comprises the step of harvesting the antibodies. In an alternative
embodiment, the method comprises the step of fusing the immunocompetent cells with
an immortal cell line under conditions such that a hybridoma is produced. In yet
another embodiment, the portion of DNA polymerase III protein or subunit protein
used as an immunogen to generate the antibodies is selected from the group consisting
of SEQ ID NOS:1, 2, 8, 9, 16, 17, 18, 19, 23, 26, 31, 34, 37, 45, 50, 55, 59, 601, 70,
75, 82, 88, 89, 90, 105, 106, 107, 109, 117, 181, 184, 187, 197, 198, 215, 216, 222,
and 231. In other embodiments, the immunogen comprises a fusion protein. In yet
another embodiment, the fusion protein comprises at least a portion of DNA
polymerase III holoenzyme or holoenzyme subunit protein.

The present invention also provides methods for detecting DNA polymerase III
holoenzyme or holoenzyme subunit expression comprising the steps of: a) providing a
sample suspected of containing DNA polymerase III holoenzyme or holoenzyme III
subunit; and a control containing a quantitated DNA polymerase III holoenzyme or
holoenzyme III subunit protein, as appropriate; and b) comparing the test DNA
polymerase III holoenzyme or holoenzyme subunit in the sample with the quantitated
DNA polymerase III holoenzyme or holoenzyme subunit in the control to determine
the relative concentration of the test DNA polymerase III holoenzyme or holoenzyme
III subunit in the sample. In addition, the methods may be conducted using any
suitable means to determine the relative concentration of DNA polymerase III
holoenzyme or holoenzyme subunit in the test and control samples, including but not
limited to the means selected from the group consisting of Western blot analysis,
Northern blot analysis, Southern blot analysis, denaturing polyacrylamide gel
electrophoresis, reverse transcriptase-coupled polymerase chain reaction, enzyme-linked
immunosorbent assay, radioimmunoassay, and fluorescent immunoassay. Thus, the
methods may be conducted to determine the presence of DNA polymerase III
holoenzyme or holoenzyme III subunit in the genome of the source of the test sample,
or the expression of DNA polymerase III holoenzyme or holoenzyme subunit (mRNA
or protein), as well as detect the presence of abnormal or mutated DNA polymerase III
holoenzyme or holoenzyme subunit proteins or gene sequences in the test samples.
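
The comparison in step b) amounts to scaling the test signal against the signal from the quantitated control. The following minimal sketch assumes that the assay signal (for example, a band intensity or an absorbance reading) is proportional to the amount of protein over the range measured; that linearity, the function name and the example numbers are assumptions made for illustration and are not part of the disclosed methods.

```python
def relative_concentration(sample_signal, control_signal, control_amount):
    """Estimate the amount in a test sample from a quantitated control.

    Assumes the signal is proportional to amount (single-point calibration).
    The result is in the same units as `control_amount`.
    """
    if control_signal <= 0:
        raise ValueError("control signal must be positive")
    if sample_signal < 0:
        raise ValueError("sample signal cannot be negative")
    return control_amount * sample_signal / control_signal


# Hypothetical readings: the control contains 20.0 ng and gives a signal of 300.0;
# the test sample gives a signal of 150.0.
estimated_ng = relative_concentration(150.0, 300.0, 20.0)
```

A single-point comparison of this kind is the simplest case; a dilution series of the control would be needed to confirm that the signal is in fact linear over the range used.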

In one preferred embodiment, the presence of DNA polymerase III holoenzyme
or holoenzyme subunit is detected by immunochemical analysis. For example, the
immunochemical analysis can comprise detecting binding of an antibody specific for
an epitope of DNA polymerase III holoenzyme or holoenzyme subunit (e.g., SEQ ID
NO:2 or 3). In another preferred embodiment of the method, the antibody
comprises polyclonal antibodies, while in another preferred embodiment, the antibody
comprises monoclonal antibodies.

The antibodies used in the methods of the present invention may be prepared using
various immunogens. In one embodiment, the immunogen is DNA polymerase III holoenzyme
or holoenzyme subunit peptide, to generate antibodies that recognize DNA polymerase
III holoenzyme or holoenzyme subunit(s). Such antibodies include, but are not limited
to polyclonal, monoclonal, chimeric, single chain, Fab fragments, and an Fab
expression library.

WO 99/13060 PCT/US98/18946 

Various procedures known in the art may be used for the production of
polyclonal antibodies to DNA polymerase III holoenzyme or holoenzyme subunit. For
the production of antibody, various host animals can be immunized by injection with
the peptide corresponding to the DNA polymerase III holoenzyme or holoenzyme
subunit epitope including but not limited to rabbits, mice, rats, sheep, goats, etc. In a
preferred embodiment, the peptide is conjugated to an immunogenic carrier (e.g.,
diphtheria toxoid, bovine serum albumin [BSA], or keyhole limpet hemocyanin
[KLH]). Various adjuvants may be used to increase the immunological response,
depending on the host species, including but not limited to Freund's (complete and
incomplete), mineral gels such as aluminum hydroxide, surface active substances such
as lysolecithin, pluronic polyols, polyanions, peptides, oil emulsions, keyhole limpet
hemocyanins, and dinitrophenol.

For preparation of monoclonal antibodies directed toward DNA polymerase III
holoenzyme or holoenzyme subunit, any technique that provides for the production of
antibody molecules by continuous cell lines in culture may be used (See, e.g., Harlow
and Lane, Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory Press,
Cold Spring Harbor, NY). These include but are not limited to the hybridoma
technique originally developed by Kohler and Milstein (Kohler and Milstein, Nature
256:495-497 [1975]), as well as other techniques known in the art.

According to the invention, techniques described for the production of single
chain antibodies (U.S. Patent 4,946,778; herein incorporated by reference) can be
adapted to produce DNA polymerase III holoenzyme or holoenzyme subunit-specific
single chain antibodies. An additional embodiment of the invention utilizes the
techniques described for the construction of Fab expression libraries (Huse et al.,
Science 246:1275-1281 [1989]) to allow rapid and easy identification of monoclonal
Fab fragments with the desired specificity for DNA polymerase III holoenzyme or
holoenzyme subunit.

Antibody fragments which contain the idiotype (antigen binding region) of the
antibody molecule can be generated by known techniques. For example, such
fragments include but are not limited to: the F(ab')2 fragment which can be produced
by pepsin digestion of the antibody molecule; the Fab' fragments which can be
generated by reducing the disulfide bridges of the F(ab')2 fragment; and the Fab
fragments which can be generated by treating the antibody molecule with papain and a
reducing agent.

In the production of antibodies, screening for the desired antibody can be
accomplished by techniques known in the art (e.g., radioimmunoassay, ELISA
[enzyme-linked immunosorbent assay], "sandwich" immunoassays, immunoradiometric
assays, gel diffusion precipitin reactions, immunodiffusion assays, in situ
immunoassays [using colloidal gold, enzyme or radioisotope labels, for example],
Western Blots, precipitation reactions, agglutination assays [e.g., gel agglutination
assays, hemagglutination assays, etc.], complement fixation assays,
immunofluorescence assays, protein A assays, and immunoelectrophoresis assays, etc.).

In one embodiment, antibody binding is detected by detecting a label on the
primary antibody. In another embodiment, the primary antibody is detected by
detecting binding of a secondary antibody or reagent to the primary antibody. In a
further embodiment, the secondary antibody is labeled. Many means are known in the
art for detecting binding in an immunoassay and are within the scope of the present
invention. (As is well known in the art, the immunogenic peptide should be provided
free of the carrier molecule used in any immunization protocol. For example, if the
peptide was conjugated to KLH, it may be conjugated to BSA, or used directly, in a
screening assay.)

The foregoing antibodies can be used in methods known in the art relating to
the localization and structure of DNA polymerase III holoenzyme or holoenzyme
subunit (e.g., for Western blotting), measuring levels thereof in appropriate biological
samples, etc. The biological samples can be tested directly for the presence of DNA
polymerase III holoenzyme or holoenzyme subunit using an appropriate strategy (e.g.,
ELISA or radioimmunoassay) and format (e.g., microwells, dipstick [e.g., as described
in International Patent Publication WO 93/03367], etc.). Alternatively, proteins in the
sample can be size separated (e.g., by polyacrylamide gel electrophoresis [PAGE], in
the presence or not of sodium dodecyl sulfate [SDS]), and the presence of DNA
polymerase III holoenzyme or holoenzyme subunit detected by immunoblotting
(Western blotting). Immunoblotting techniques are generally more effective with
antibodies generated against a peptide corresponding to an epitope of a protein, and
hence, are particularly suited to the present invention.

The present invention also provides methods for identification of polymerase
subunits comprising the steps of: a) providing a test sample suspected of containing
amplifiable nucleic acid encoding at least a portion of a DNA polymerase III subunit
protein; b) isolating the amplifiable nucleic acid from the test sample; c) combining
the amplifiable nucleic acid with amplification reagents, and at least two primers
selected from the group consisting of primers having the nucleic acid sequence set
forth in SEQ ID NOS:21, 22, 86, 87, 91, 97, 110, 111, 112, 113, 114, 115, 116, 134,
135, 136, 137, 156, 157, 173, 174, 176, 177, 208, 209, 210, 211, 212, and 213, to
form a reaction mixture; and d) combining the reaction mixture with an amplification
enzyme under conditions wherein the amplifiable nucleic acid is amplified to form
amplification product.

In some embodiments, the methods further comprise the step of detecting the
amplification product. In some preferred embodiments, the detecting is accomplished
by hybridization of the amplification product with a probe. In yet other embodiments,
the primers are capable of hybridizing to nucleic acid encoding at least one polymerase
subunit selected from the group consisting of DnaA, DnaN, DnaE, DnaX, DnaQ, and
DnaB. In still other embodiments, the test sample comprises nucleic acid obtained
from a thermophilic organism. In particularly preferred embodiments, the thermophilic
organism is a member of the genus Thermus.

The present invention also provides amplification methods comprising the steps
of: a) providing a test sample suspected of containing amplifiable nucleic acid,
amplification reagents, DNA polymerase, an adjunct component comprising at least
one subunit of DnaX, DnaE, DnaQ, DnaN, and DnaA, and at least two primers;
b) isolating the amplifiable nucleic acid from the test sample; c) combining the
amplifiable nucleic acid with the amplification reagents, the adjunct component, the
DNA polymerase, and the primers under conditions such that amplifiable nucleic acid
is amplified to form amplification product. In some preferred embodiments, the
method further comprises the step of detecting the amplification product. In yet other
preferred embodiments, the detecting is accomplished by hybridization of the
amplification product with a probe. In still other embodiments, the primers are
capable of hybridizing to nucleic acid encoding at least one polymerase subunit
selected from the group consisting of DnaA, DnaN, DnaE, DnaX, DnaQ, and DnaB.
In yet other embodiments, the DNA polymerase is selected from the group consisting
of Taq polymerase, E. coli DNA polymerase I, Klenow, Pfu polymerase, Tth
polymerase, Tru polymerase, Tfl polymerase, Thermococcus DNA polymerase, and
Thermotoga DNA polymerase. In yet other embodiments, the amplification reagent
comprises components selected from the group consisting of single-stranded binding
proteins, helicases, and accessory factors.

The foregoing explanations of particular assay systems are presented herein for
purposes of illustration only, in fulfillment of the duty to present an enabling
disclosure of the invention. It is to be understood that the present invention
contemplates a variety of immunochemical, amplification, and detection assay
protocols within its spirit and scope.

DESCRIPTION OF THE FIGURES 

In all of the following Figures that show alignments (DNA or amino acids), the
indicates similar, but not identical residues. In the DNA sequences with
underlined regions, unless otherwise indicated, the underlining indicates bases
generated by the degenerate primers used to generate the DNA of interest. Also unless
otherwise indicated, the sequences between the sequences generated by the primers
were used in the searches to generate deduced amino acid sequences (i.e., the primer-
generated sequences were excluded from the searches).

Figure 1 shows a graph depicting the inactivation of the M13Gori assay
mixture at elevated temperatures.

Figure 2 shows the elution profile of Fraction II applied to a cation-exchange
BioRex-70 column. Fraction II prepared from (A) 2.4 kg cells, and (B) 300 g of cells
was applied at fraction 1 in both A and B.

Figure 3 shows the elution profile of Fraction III applied to a hydrophobic
ToyoPearl-Ether 65-M column to separate the majority of T. thermophilus DNA
polymerase from pol III.

Figure 4 shows a Western blot of ToyoPearl-Ether fractions 41-52 developed
with monoclonal antibodies specific for E. coli DNA polymerase III α subunit.

Figure 5 shows the elution profile of Fraction IV applied to an anion exchange
Q-Sepharose Fast Flow column. Fraction IV was applied at fraction 1.

Figure 6 shows a Coomassie blue-stained SDS-polyacrylamide gel containing
Q-Sepharose fractions 28-40.

Figure 7 shows the DNA polymerase gap-filling activity of T. thermophilus 
Fraction V at different temperatures. 

Figure 8 shows the amino terminal sequence of the isolated candidate γ (SEQ
ID NO:1) and τ (SEQ ID NO:2) T. thermophilus dnaX gene products and comparison
to the homologous sequences of E. coli (SEQ ID NO:3) and B. subtilis (SEQ ID
NO:4). T. thermophilus is abbreviated as Tth. The sequences where identical or
similar matches occur are also shown (SEQ ID NOS:5 and 6).

Figure 9A shows the DNA sequence (SEQ ID NO:7) of T. thermophilus dnaX
and flanking sequences.

Figure 9B shows the deduced amino acid sequence (SEQ ID NO:8) of the τ
gene product of T. thermophilus dnaX. The peptide sequences directly determined
from the isolated T. thermophilus DnaX proteins are underlined.

Figure 9C shows a comparison of T. thermophilus dnaX (SEQ ID NO:9) with
homologous eubacterial dnaX sequences from E. coli (SEQ ID NO:10), B. subtilis
(SEQ ID NO:11), Mycoplasma pneumoniae (SEQ ID NO:12), Caulobacter crescentus
(SEQ ID NO:13), and Synechocystis sp. (SEQ ID NO:14) (a consensus sequence is also
shown [SEQ ID NO:15]).

Figure 9D shows the deduced sequence of the carboxyl-terminus of T.
thermophilus γ subunit depending on whether the frameshift is -1 (as in E. coli) or +1
(SEQ ID NOS:16-19).

Figure 10A shows the nucleotide sequence (SEQ ID NO:20) of T. thermophilus
dnaX product obtained from PCR amplification of T. thermophilus chromosomal DNA
with primers X1Fa (SEQ ID NO:21) and X139R (SEQ ID NO:22); the sequences
generated by the primers are underlined in this Figure (SEQ ID NOS:190 and 85).

Figure 10B shows an alignment of the amino acid sequence (SEQ ID NO:23)
deduced from the underlined primers of 10A (i.e., T. thermophilus DnaX product),
with B. subtilis DNA polymerase dnaX sequence (SEQ ID NO:24). As with prior
Figures, matching amino acids are indicated (SEQ ID NO:25).

Figure 11A shows an alignment of T. thermophilus 130 kDa polypeptide
N-terminal amino acid sequence (dnaE) (SEQ ID NO:26) with sequences from M.
tuberculosis (SEQ ID NO:27) and H. influenzae (SEQ ID NO:28) dnaE gene
sequences. The sequences that are identical between the corresponding peptides are
also shown (SEQ ID NOS:29 and 30).

Figure 11B shows comparisons between the internal amino acid sequences of
three T. thermophilus dnaE regions compared with sequences from E. coli. In this
Figure, the three T. thermophilus peptides were #91 (SEQ ID NO:31), #676 (SEQ ID
NO:34), and #853 (SEQ ID NO:37). For E. coli, the peptide number refers to the first
amino acid residue in the amino acid sequence of the E. coli DNA pol III α subunit
with which the T. thermophilus 130 kDa polypeptide aligned (i.e., 92 [SEQ ID
NO:32]; 676 [SEQ ID NO:35]; and 38 [SEQ ID NO:38]). As with other Figures,
identical or similar amino acids are indicated between the T. thermophilus and E. coli
sequences (SEQ ID NOS:33, 36, and 39).

Figure 12A shows the nucleotide sequence (SEQ ID NO:40) of T. thermophilus
dnaE product obtained from PCR amplification of T. thermophilus chromosomal DNA
with primers E1F (SEQ ID NO:87) and E91R (SEQ ID NO:86). The sequences
underlined in this Figure are the regions recognized by these primers
(SEQ ID NOS:41 and 42).

Figure 12B shows alignments of the T. thermophilus dnaE amino acid sequence
(SEQ ID NOS:45, 50, and 55) deduced from Figure 12A (between the underlined
primers), compared with Synechocystis sp. sequences (SEQ ID NOS:43, 48, and 53),
and M. tuberculosis sequences (SEQ ID NOS:47, 52, and 57). As with other Figures,
identical or similar amino acids are also shown (SEQ ID NOS:44, 46, 49, 51, 54, and
56).

Figure 13A shows the nucleotide sequence (SEQ ID NO:58) of the first
approximately 1,109 bases of the T. thermophilus dnaE gene.

Figure 13B shows the preliminary deduced amino acid sequence (SEQ ID
NO:59) corresponding to the N-terminal portion of the dnaE gene shown in Figure
13A.

Figure 13C shows an alignment of the T. thermophilus dnaE gene (SEQ ID
NO:61) with regions of homology in the M. tuberculosis DNA polymerase III α
subunit sequence (SEQ ID NO:60) and Synechocystis sp. DNA polymerase III α
subunit sequence (SEQ ID NO:62).

Figure 14A shows the region of asymmetric PCR product corresponding to the
region close to the N-terminal end of the T. thermophilus dnaE gene (SEQ ID
NO:63) (i.e., the "front end of the clone"). In this Figure, the region corresponding to
the sequence generated by the forward primer is underlined and shown in bold (SEQ
ID NO:64).

Figure 14B shows the back end of the clone (SEQ ID NO:180). In this Figure,
the underlined bases correspond to part of the PstI site.

Figure 14C shows the alignment of the amino acid sequence of T. thermophilus
DnaE product deduced from the entire sequence shown in Figure 14B (SEQ ID
NOS:181, 184, and 187), with E. coli DNA polymerase III sequences (SEQ ID
NOS:183, 186, and 189). As with prior Figures, identical and similar amino acids are
indicated (SEQ ID NOS:182, 185, and 188).

Figure 15A shows the deduced nucleotide sequence (SEQ ID NO:65) of T.
thermophilus dnaA product obtained from PCR amplification of T. thermophilus
chromosomal DNA between the primers A177Fb (SEQ ID NO:135) and A251Rb
(SEQ ID NO:157). The sequences recognized by the primers are underlined in this
Figure (SEQ ID NOS:66 and 67).

Figure 15B shows an alignment of the deduced amino acid sequences (SEQ ID
NOS:70 and 75) of T. thermophilus dnaA product, with E. coli dnaA sequence (SEQ ID
NOS:68 and 73), and B. subtilis dnaA sequence (SEQ ID NOS:72 and 76). As with
other Figures, identical and similar residues are also shown (SEQ ID NOS:69, 71, and
74).

Figure 16A shows the nucleotide sequence (SEQ ID NO:60) of T. thermophilus
dnaQ product obtained from PCR amplification of T. thermophilus chromosomal DNA
with primers Q12Fa (SEQ ID NO:173) and Q98Ra (SEQ ID NO:176). The sequences
generated by these primers are underlined in this Figure (SEQ ID NOS:78 and 79).

Figure 16B shows an alignment of the deduced amino acid sequences (SEQ ID
NO:82) of T. thermophilus dnaQ product, with E. coli DNA polymerase III
holoenzyme ε subunit sequence (SEQ ID NO:80), and B. subtilis DNA polymerase III
subunit sequence (SEQ ID NO:84). As with other Figures, identical and similar
residues are also shown (SEQ ID NOS:81 and 83).

Figure 17A shows the sequence of full-length T. thermophilus dnaE gene (SEQ
ID NO:196). In this Figure, the dnaE reading frame is shown in bold and underlined.

Figure 17B shows the deduced amino acid sequence of the full-length T.
thermophilus DnaE protein (SEQ ID NO:197). In this Figure, sequences corresponding
to the peptides sequenced from isolated native T. thermophilus DnaE are shown in
bold.

Figure 17C shows the alignment of regions from the T. thermophilus dnaE
protein (SEQ ID NO:198), with homologous regions of eubacterial dnaE genes from
three representative sequences of E. coli, B. subtilis type 1, and Borrelia burgdorferi
DnaE (SEQ ID NOS:199, 200, and 201, respectively).

Figure 17D shows the deduced amino acid sequence (SEQ ID NO:223) of Tth
DnaE containing Biotin/Hexahis tag on the amino terminus.

Figure 18A shows a major portion of the Tth dnaQ gene encoding epsilon
(SEQ ID NO:214).

Figure 18B shows the amino acid sequence (SEQ ID NO:215) of the
continuous open reading frame encoded by SEQ ID NO:214.

Figure 18C shows the protein (SEQ ID NO:216) that would be expressed if the
GTG at codon 36 of SEQ ID NO:214 is used as the initiating codon.

Figure 18D shows the alignment of regions from the T. thermophilus dnaQ gene
(SEQ ID NO:217), with homologous regions of eubacterial dnaQ genes from three
representative sequences of Treponema pallidum, Aquifex aeolicus, and E. coli DnaE
(SEQ ID NOS:218, 219, and 220, respectively).

Figure 19A shows the DNA sequence (SEQ ID NO:221) of the 3' terminal
portion intergenic region of dnaA that may be the T. thermophilus origin of
replication.

Figure 19B shows the amino acid sequence (SEQ ID NO:222) of the 5' portion
of the T. thermophilus DnaA open reading frame.

Figure 20A shows a partial DNA sequence (SEQ ID NO:230) of dnaN.

Figure 20B shows the deduced amino acid sequence (SEQ ID NO:231) of the
amino-terminal portion of DnaN encoded by the DNA sequence (SEQ ID NO:230)
shown in Figure 20A.

Figure 20C shows the alignment of regions of T. thermophilus DnaN sequence
(SEQ ID NO:232) with homologous regions of DnaN from E. coli (SEQ ID NO:233)
and Streptococcus pneumoniae (SEQ ID NO:234).

DEFINITIONS 

As used herein, the term "DNA polymerase III holoenzyme" refers to the entire
DNA polymerase III entity (i.e., all of the polymerase subunits, as well as the other
associated accessory proteins required for processive replication of a chromosome or
genome), while "DNA polymerase III" is just the core (α, ε, θ). "DNA polymerase
III holoenzyme subunit" is used in reference to any of the subunit entities that
comprise the DNA polymerase III holoenzyme. Thus, the term "DNA polymerase III"
encompasses "DNA polymerase III holoenzyme subunits" and "DNA polymerase III
subunits."

The term "5' exonuclease activity" refers to the presence of an activity in a
protein which is capable of removing nucleotides from the 5' end of an
oligonucleotide. 5' exonuclease activity may be measured using any of the assays
provided herein.

The term "3' exonuclease activity" refers to the presence of an activity in a
protein which is capable of removing nucleotides from the 3' end of an
oligonucleotide. 3' exonuclease activity may be measured using any of the assays
provided herein.

The terms "DNA polymerase activity," "synthetic activity" and "polymerase
activity" are used interchangeably and refer to the ability of an enzyme to synthesize
new DNA strands by the incorporation of deoxynucleoside triphosphates. The
examples below provide assays for the measurement of DNA polymerase activity. A
protein which can direct the synthesis of new DNA strands by the incorporation of
deoxynucleoside triphosphates in a template-dependent manner is said to be "capable
of DNA synthetic activity."

A "DNA synthesis terminating agent which terminates DNA synthesis at a
specific nucleotide base" refers to compounds, including but not limited to,
dideoxynucleosides having a 2', 3' dideoxy structure (e.g., ddATP, ddCTP, ddGTP
and ddTTP). Any compound capable of specifically terminating a DNA sequencing
reaction at a specific base may be employed as a DNA synthesis terminating agent.

The term "gene" refers to a nucleic acid (e.g., DNA) sequence that comprises
coding sequences necessary for the production of a polypeptide or precursor (e.g.,
DNA polymerase III holoenzyme or holoenzyme subunit, as appropriate). The
polypeptide can be encoded by a full length coding sequence or by any portion of the
coding sequence so long as the desired activity or functional properties (e.g.,
enzymatic activity, ligand binding, signal transduction, etc.) of the full-length or
fragment are retained. The term also encompasses the coding region of a structural
gene and includes sequences located adjacent to the coding region on both the 5'
and 3' ends for a distance of about 1 kb on either end such that the gene corresponds
to the length of the full-length mRNA. The term "gene" encompasses both cDNA and
genomic forms of a gene. A genomic form or clone of a gene contains the coding
region interrupted with non-coding sequences termed "intervening regions" or
"intervening sequences." The mRNA functions during translation to specify the
sequence or order of amino acids in a nascent polypeptide.
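
By way of illustration only, the way in which the order of nucleotides in a
coding sequence specifies the order of amino acids may be sketched in Python. The
function name translate and the example sequence are hypothetical and are not taken
from the sequence listing; the table is the standard genetic code written in DNA
letters, and the sketch ignores intervening sequences and alternative initiation
codons.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order
# (TTT, TTC, TTA, TTG, TCT, ...); "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(coding_dna):
    """Translate a coding (sense-strand) DNA sequence into one-letter amino acids.

    Reading starts at the first base and stops at the first stop codon
    or when fewer than three bases remain.
    """
    seq = coding_dna.upper()
    protein = []
    for i in range(0, len(seq) - 2, 3):
        amino_acid = CODON_TABLE[seq[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical example: four sense codons followed by a stop codon.
print(translate("ATGGCTAAAGGTTAA"))
```

Changing the order of the bases changes the order of the residues, which is the
sense in which a DNA sequence "codes for" an amino acid sequence.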

In particular, the terms "DNA polymerase III holoenzyme" and "holoenzyme
subunit gene" refer to the full-length DNA polymerase III holoenzyme, and
holoenzyme subunit nucleotide sequence(s), respectively. However, it is also intended
that the terms encompass fragments of the DNA polymerase III holoenzyme and
holoenzyme subunit sequences, such as those that encode particular domains of
interest, including subunit proteins, as well as other domains within the full-length
DNA polymerase III holoenzyme or holoenzyme subunit nucleotide sequence.
Furthermore, the terms "DNA polymerase III holoenzyme," "holoenzyme subunit
nucleotide sequence," "DNA polymerase III holoenzyme," and "holoenzyme subunit
polynucleotide sequence" encompass DNA, cDNA, and RNA (e.g., mRNA)
sequences.

Where "amino acid sequence" is recited herein to refer to an amino acid
sequence of a naturally occurring protein molecule, "amino acid sequence" and like
terms, such as "polypeptide" or "protein" are not meant to limit the amino acid
sequence to the complete, native amino acid sequence associated with the recited
proteins.

Genomic forms of a gene may also include sequences located on both the 5'
and 3' end of the sequences which are present on the RNA transcript. These
sequences are referred to as "flanking" sequences or regions (these flanking sequences
are located 5' or 3' to the non-translated sequences present on the mRNA transcript).
The 5' flanking region may contain regulatory sequences such as promoters and
enhancers which control or influence the transcription of the gene. The 3' flanking
region may contain sequences which direct the termination of transcription,
post-transcriptional cleavage and polyadenylation.

The term "wild-type" refers to a gene or gene product which has the
characteristics of that gene or gene product when isolated from a naturally occurring
source. A wild-type gene is that which is most frequently observed in a population
and is thus arbitrarily designated the "normal" or "wild-type" form of the gene. In
contrast, the term "modified" or "mutant" refers to a gene or gene product which
displays modifications in sequence and/or functional properties (i.e., altered
characteristics) when compared to the wild-type gene or gene product. It is noted that
naturally-occurring mutants can be isolated; these are identified by the fact that they
have altered characteristics when compared to the wild-type gene or gene product.

As used herein, the terms "nucleic acid molecule encoding," "DNA sequence
encoding," and "DNA encoding" refer to the order or sequence of
deoxyribonucleotides along a strand of deoxyribonucleic acid. The order of these
deoxyribonucleotides determines the order of amino acids along the polypeptide
(protein) chain. The DNA sequence thus codes for the amino acid sequence.

The term "oligonucleotide" as used herein is defined as a molecule comprised
of two or more deoxyribonucleotides or ribonucleotides, preferably more than three,
and usually more than ten. The exact size will depend on many factors, which in turn
depend on the ultimate function or use of the oligonucleotide. The oligonucleotide
may be generated in any manner, including chemical synthesis, DNA replication,
reverse transcription, or a combination thereof.

Because mononucleotides are reacted to make oligonucleotides in a manner
such that the 5' phosphate of one mononucleotide pentose ring is attached to the 3'
oxygen of its neighbor in one direction via a phosphodiester linkage, an end of an
oligonucleotide is referred to as the "5' end" if its 5' phosphate is not linked to the 3'
oxygen of a mononucleotide pentose ring and as the "3' end" if its 3' oxygen is not
linked to a 5' phosphate of a subsequent mononucleotide pentose ring. As used
herein, a nucleic acid sequence, even if internal to a larger oligonucleotide, also may
be said to have 5' and 3' ends.

When two different, non-overlapping oligonucleotides anneal to different
regions of the same linear complementary nucleic acid sequence, and the 3' end of one
oligonucleotide points towards the 5' end of the other, the former may be called the
"upstream" oligonucleotide and the latter the "downstream" oligonucleotide. In either
a linear or circular DNA molecule, discrete elements are referred to as being
"upstream" or 5' of the "downstream" or 3' elements. This terminology reflects the
fact that transcription proceeds in a 5' to 3' fashion along the DNA strand. The
promoter and enhancer elements which direct transcription of a linked gene are
generally located 5' or upstream of the coding region. However, enhancer elements
can exert their effect even when located 3' of the promoter element and the coding
region. Transcription termination and polyadenylation signals are located 3' or
downstream of the coding region.

As used herein, the term "coding region" when used in reference to a structural
gene refers to the nucleotide sequences which encode the amino acids found in the
nascent polypeptide as a result of translation of a mRNA molecule. The coding region
is bounded on the 5' side by the nucleotide triplet "ATG" which encodes the initiator
methionine and on the 3' side by one of the three triplets which specify stop codons
(i.e., TAA, TAG, TGA).
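
For purposes of illustration only, the boundaries of a coding region as defined
above may be located with a short Python sketch. The function name
find_coding_region and the test sequence are hypothetical. The sketch assumes a
single uninterrupted reading frame on the given strand and an ATG initiator; it does
not handle alternative initiation codons (such as the GTG discussed for Figure 18C)
or intervening sequences.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_coding_region(dna):
    """Return the first ATG-to-stop coding region found in dna, or None.

    The search scans left to right for an ATG, then reads in-frame
    triplets until a stop codon is met. The returned string includes
    both the ATG and the stop codon.
    """
    seq = dna.upper()
    start = seq.find("ATG")
    while start != -1:
        # Walk codon by codon in the frame set by this ATG.
        for i in range(start + 3, len(seq) - 2, 3):
            if seq[i:i + 3] in STOP_CODONS:
                return seq[start:i + 3]
        # No in-frame stop after this ATG; try the next ATG.
        start = seq.find("ATG", start + 1)
    return None


# Hypothetical sequence with two flanking bases on each side.
print(find_coding_region("GGATGAAACCCTAGTT"))
```

The bases outside the returned span correspond to what the specification calls
flanking sequence.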

As used herein, the terms "an oligonucleotide having a nucleotide sequence
encoding a gene" and "polynucleotide having a nucleotide sequence encoding a gene,"
mean a nucleic acid sequence comprising the coding region of a gene or in other
words the nucleic acid sequence which encodes a gene product. The coding region
may be present in either a cDNA, genomic DNA or RNA form. When present in a
DNA form, the oligonucleotide or polynucleotide may be single-stranded (i.e., the
sense strand) or double-stranded. Suitable control elements such as
enhancers/promoters, splice junctions, polyadenylation signals, etc. may be placed in
close proximity to the coding region of the gene if needed to permit proper initiation
of transcription and/or correct processing of the primary RNA transcript.
Alternatively, the coding region utilized in the expression vectors of the present
invention may contain endogenous enhancers/promoters, splice junctions, intervening
sequences, polyadenylation signals, etc., or a combination of both endogenous and
exogenous control elements.

As used herein, the term "regulatory element" refers to a genetic element which 
controls some aspect of the expression of nucleic acid sequences. For example, a 
promoter is a regulatory element which facilitates the initiation of transcription of an 
operably linked coding region. Other regulatory elements are splicing signals, 
polyadenylation signals, termination signals, etc. (defined infra). 

Transcriptional control signals in eukaryotes comprise "promoter" and
"enhancer" elements. Promoters and enhancers consist of short arrays of DNA
sequences that interact specifically with cellular proteins involved in transcription
(Maniatis et al., Science 236:1237 [1987]). Promoter and enhancer elements have
been isolated from a variety of eukaryotic sources including genes in yeast, insect and
mammalian cells and viruses (analogous control elements, i.e., promoters, are also
found in prokaryotes). The selection of a particular promoter and enhancer depends on
what cell type is to be used to express the protein of interest. Some eukaryotic
promoters and enhancers have a broad host range while others are functional in a
limited subset of cell types (for review see, Voss et al., Trends Biochem. Sci., 11:287
[1986]; and Maniatis et al., supra). For example, the SV40 early gene enhancer is
very active in a wide variety of cell types from many mammalian species and has been
widely used for the expression of proteins in mammalian cells (Dijkema et al., EMBO
J. 4:761 [1985]). Two other examples of promoter/enhancer elements active in a
broad range of mammalian cell types are those from the human elongation factor 1α
gene (Uetsuki et al., J. Biol. Chem., 264:5791 [1989]; Kim et al., Gene 91:217 [1990];
and Mizushima and Nagata, Nucl. Acids. Res., 18:5322 [1990]) and the long terminal
repeats of the Rous sarcoma virus (Gorman et al., Proc. Natl. Acad. Sci. USA 79:6777
[1982]) and the human cytomegalovirus (Boshart et al., Cell 41:521 [1985]).

As used herein, the term "promoter/enhancer" denotes a segment of DNA
which contains sequences capable of providing both promoter and enhancer functions
(i.e., the functions provided by a promoter element and an enhancer element, see
above for a discussion of these functions). For example, the long terminal repeats of
retroviruses contain both promoter and enhancer functions. The enhancer/promoter
may be "endogenous" or "exogenous" or "heterologous." An "endogenous"
enhancer/promoter is one which is naturally linked with a given gene in the genome.
An "exogenous" or "heterologous" enhancer/promoter is one which is placed in
juxtaposition to a gene by means of genetic manipulation (i.e., molecular biological
techniques) such that transcription of that gene is directed by the linked
enhancer/promoter.

Efficient expression of recombinant DNA sequences in eukaryotic cells requires
expression of signals directing the efficient termination and polyadenylation of the
resulting transcript. Transcription termination signals are generally found downstream
of the polyadenylation signal and are a few hundred nucleotides in length. The term
"poly A site" or "poly A sequence" as used herein denotes a DNA sequence which
directs both the termination and polyadenylation of the nascent RNA transcript.
Efficient polyadenylation of the recombinant transcript is desirable as transcripts
lacking a poly A tail are unstable and are rapidly degraded. The poly A signal utilized
in an expression vector may be "heterologous" or "endogenous." An endogenous poly
A signal is one that is found naturally at the 3' end of the coding region of a given
gene in the genome. A heterologous poly A signal is one which is
isolated from one gene and placed 3' of another gene. A commonly used heterologous
poly A signal is the SV40 poly A signal. The SV40 poly A signal is contained on a
237 bp BamHI/BclI restriction fragment and directs both termination and
polyadenylation (Sambrook, supra, at 16.6-16.7).

As used herein, the term "vector" is used in reference to nucleic acid molecules 
that transfer DNA segment(s) from one cell to another. The term "vehicle" is 
sometimes used interchangeably with "vector." 

The term "expression vector" as used herein refers to a recombinant DNA
molecule containing a desired coding sequence and appropriate nucleic acid sequences
necessary for the expression of the operably linked coding sequence in a particular
host organism. Nucleic acid sequences necessary for expression in prokaryotes usually
include a promoter, an operator (optional), and a ribosome binding site, often along
with other sequences. Eukaryotic cells are known to utilize promoters, enhancers, and
termination and polyadenylation signals.

The term "transfection" as used herein refers to the introduction of foreign
DNA into eukaryotic cells. Transfection may be accomplished by a variety of means
known to the art including calcium phosphate-DNA co-precipitation,
DEAE-dextran-mediated transfection, polybrene-mediated transfection, electroporation,
microinjection, liposome fusion, lipofection, protoplast fusion, retroviral infection,
and biolistics.

As used herein, the term "selectable marker" refers to the use of a gene which
encodes an enzymatic activity that confers the ability to grow in medium lacking what
would otherwise be an essential nutrient (e.g., the HIS3 gene in yeast cells); in
addition, a selectable marker may confer resistance to an antibiotic or drug upon the
cell in which the selectable marker is expressed. Selectable markers may be
"dominant"; a dominant selectable marker encodes an enzymatic activity which can be
detected in any eukaryotic cell line. Examples of dominant selectable markers include
the bacterial aminoglycoside 3' phosphotransferase gene (also referred to as the neo
gene) which confers resistance to the drug G418 in mammalian cells, the bacterial
hygromycin G phosphotransferase (hyg) gene which confers resistance to the antibiotic
hygromycin and the bacterial xanthine-guanine phosphoribosyl transferase gene (also
referred to as the gpt gene) which confers the ability to grow in the presence of
mycophenolic acid. Other selectable markers are not dominant in that their use must
be in conjunction with a cell line that lacks the relevant enzyme activity. Examples of
non-dominant selectable markers include the thymidine kinase (tk) gene which is used
in conjunction with tk- cell lines, the CAD gene which is used in conjunction with
CAD-deficient cells and the mammalian hypoxanthine-guanine phosphoribosyl
transferase (hprt) gene which is used in conjunction with hprt- cell lines. A review of
the use of selectable markers in mammalian cell lines is provided in Sambrook et al.,
Molecular Cloning: A Laboratory Manual, 2nd ed., Cold Spring Harbor Laboratory
Press, New York (1989) pp. 16.9-16.15.

Eukaryotic expression vectors may also contain "viral replicons" or "viral
origins of replication." Viral replicons are viral DNA sequences which allow for the
extrachromosomal replication of a vector in a host cell expressing the appropriate
replication factors. Vectors which contain either the SV40 or polyoma virus origin of
replication replicate to high copy number (up to 10^4 copies/cell) in cells that express
the appropriate viral T antigen. Vectors which contain the replicons from bovine
papillomavirus or Epstein-Barr virus replicate extrachromosomally at low copy number
(~100 copies/cell).

The thermophilic DNA polymerase III holoenzyme or holoenzyme subunits
may be expressed in either prokaryotic or eukaryotic host cells. Nucleic acid encoding
the thermophilic DNA polymerase III holoenzyme or holoenzyme subunit may be
introduced into bacterial host cells by a number of means including transformation of
bacterial cells made competent for transformation by treatment with calcium chloride
or by electroporation. If the thermophilic DNA polymerase III holoenzyme or
holoenzyme subunit is to be expressed in eukaryotic host cells, nucleic acid encoding
the thermophilic DNA polymerase III holoenzyme or holoenzyme subunit may be
introduced into eukaryotic host cells by a number of means including calcium
phosphate co-precipitation, spheroplast fusion, electroporation and the like. When the
eukaryotic host cell is a yeast cell, transformation may be effected by treatment of the
host cells with lithium acetate or by electroporation.

"Hybridization" methods involve the annealing of a complementary sequence to
the target nucleic acid (the sequence to be detected). The ability of two polymers of
nucleic acid containing complementary sequences to find each other and anneal
through base pairing interaction is a well-recognized phenomenon. The initial
observations of the "hybridization" process by Marmur and Lane (See e.g., Marmur
and Lane, Proc. Natl. Acad. Sci. USA 46:453 [1960]; and Doty et al., Proc. Natl.
Acad. Sci. USA 46:461 [1960]) have been followed by the refinement of this process
into an essential tool of modern biology. Nonetheless, a number of problems have
prevented the wide scale use of hybridization as a tool in diagnostics. Among the
more formidable problems are: 1) the inefficiency of hybridization; 2) the low
concentration of specific target sequences in a mixture of genomic DNA; and 3) the
hybridization of only partially complementary probes and targets.
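
By way of illustration only, the distinction between completely and partially
complementary probes may be sketched in Python. The function names
reverse_complement and count_mismatches and the short sequences are hypothetical;
the sketch compares sequences position by position and makes no attempt to model
hybridization kinetics, stringency, or secondary structure.

```python
# Watson-Crick pairing for DNA letters.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the strand that would anneal to seq, written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def count_mismatches(probe, target_site):
    """Count positions at which probe fails to pair with target_site.

    Both arguments are written 5' to 3' and must be the same length.
    A result of 0 means the probe is completely complementary; any
    larger value means it is only partially complementary.
    """
    perfect_probe = reverse_complement(target_site)
    if len(probe) != len(perfect_probe):
        raise ValueError("probe and target site must be the same length")
    return sum(p != q for p, q in zip(probe.upper(), perfect_probe))


target = "GATTACA"                           # hypothetical target site
print(reverse_complement(target))            # fully complementary probe
print(count_mismatches("TGTAATC", target))   # complete complementarity
print(count_mismatches("TGTTATC", target))   # one mismatched position
```

A probe with a small number of mismatches may still anneal under permissive
conditions, which is the third problem listed above.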

With regard to efficiency, it is experimentally observed that only a fraction of
the possible number of probe-target complexes is formed in a hybridization reaction.
This is particularly true with short oligonucleotide probes (less than 100 bases in
length). There are three fundamental causes: a) hybridization cannot occur because of
secondary and tertiary structure interactions; b) strands of DNA containing the target
sequence have rehybridized (reannealed) to their complementary strand; and c) some
target molecules are prevented from hybridization when they are used in hybridization
formats that immobilize the target nucleic acids to a solid surface.

Even where the sequence of a probe is completely complementary to the
sequence of the target (i.e., the target's primary structure), the target sequence must be
made accessible to the probe via rearrangements of higher-order structure. These
higher-order structural rearrangements may concern either the secondary structure or
tertiary structure of the molecule. Secondary structure is determined by intramolecular
bonding. In the case of DNA or RNA targets this consists of hybridization within a
single, continuous strand of bases (as opposed to hybridization between two different
strands). Depending on the extent and position of intramolecular bonding, the probe
can be displaced from the target sequence, preventing hybridization.

Solution hybridization of oligonucleotide probes to denatured double-stranded
DNA is further complicated by the fact that the longer complementary target strands
can renature or reanneal. Again, hybridized probe is displaced by this process. This
results in a low yield of hybridization (low "coverage") relative to the starting
concentrations of probe and target.

With regard to low target sequence concentration, the DNA fragment
containing the target sequence is usually in relatively low abundance in genomic DNA.
This presents great technical difficulties; most conventional methods that use
oligonucleotide probes lack the sensitivity necessary to detect hybridization at such low
levels.

One attempt at a solution to the target sequence concentration problem is the
amplification of the detection signal. Most often this entails placing one or more
labels on an oligonucleotide probe. In the case of non-radioactive labels, even the
highest affinity reagents have been found to be unsuitable for the detection of single
copy genes in genomic DNA with oligonucleotide probes (See, Wallace et al.,
Biochimie 67:755 [1985]). In the case of radioactive oligonucleotide probes, only
extremely high specific activities are found to show satisfactory results (See,
Studencki and Wallace, DNA 3:1 [1984]; and Studencki et al., Human Genetics 37:42
[1985]).

With regard to complementarity, it is important for some diagnostic 
applications to determine whether the hybridization represents complete or partial 
complementarity. For example, where it is desired to detect simply the presence or 
absence of pathogen DNA (such as from a virus, bacterium, fungus, mycoplasma, or 
protozoan), it is only important that the hybridization method ensures hybridization 
when the relevant sequence is present; conditions can be selected where both partially 
complementary probes and completely complementary probes will hybridize. Other 
diagnostic applications, however, may require that the hybridization method distinguish 
between partial and complete complementarity. It may be of interest to detect genetic 
polymorphisms. For example, human hemoglobin is composed, in part, of four 
polypeptide chains. Two of these chains are identical chains of 141 amino acids 
(alpha chains) and two of these chains are identical chains of 146 amino acids (beta 
chains). The gene encoding the beta chain is known to exhibit polymorphism. The 
normal allele encodes a beta chain having glutamic acid at the sixth position. The 
mutant allele encodes a beta chain having valine at the sixth position. This difference 
in amino acids has a profound (most profound when the individual is homozygous for 
the mutant allele) physiological impact known clinically as sickle cell anemia. It is 
well known that the genetic basis of the amino acid change involves a single base 
difference between the normal allele DNA sequence and the mutant allele DNA 
sequence. 

Unless combined with other techniques (such as restriction enzyme analysis), 
methods that allow for the same level of hybridization in the case of both partial as 
well as complete complementarity are typically unsuited for such applications; the 
probe will hybridize to both the normal and variant target sequence. Hybridization, 
regardless of the method used, requires some degree of complementarity between the 
sequence being assayed (the target sequence) and the fragment of DNA used to 
perform the test (the probe). Of course, those of skill in the art know that one can 
obtain binding without any complementarity, but this binding is nonspecific and to be 
avoided. 

As used herein, the terms "complementary" or "complementarity" are used in 
reference to polynucleotides (i.e., a sequence of nucleotides) related by the 
base-pairing rules. For example, the sequence "A-G-T" is complementary to the 
sequence "T-C-A." Complementarity may be "partial," in which only some of the 
nucleic acids' bases are matched according to the base pairing rules. Or, there may be 
"complete" or "total" complementarity between the nucleic acids. The degree of 
complementarity between nucleic acid strands has significant effects on the efficiency 
and strength of hybridization between nucleic acid strands. This is of particular 
importance in amplification reactions, as well as detection methods which depend upon 
binding between nucleic acids. 
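
The distinction between partial and complete complementarity can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the sequences are compared position by position in the same left-to-right order as the "A-G-T" / "T-C-A" example, and strand orientation, gaps, and non-standard bases are ignored.

```python
# Watson-Crick base-pairing rules for DNA.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(seq: str) -> str:
    """Return the base-by-base complement, in the same left-to-right order."""
    return "".join(PAIRS[base] for base in seq.upper())


def percent_complementarity(a: str, b: str) -> float:
    """Percentage of aligned positions that obey the base-pairing rules.

    Assumes two non-empty sequences of equal length (a simplification).
    """
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    matched = sum(1 for x, y in zip(a.upper(), b.upper()) if PAIRS.get(x) == y)
    return 100.0 * matched / len(a)


print(complement("AGT"))                         # TCA
print(percent_complementarity("AGT", "TCA"))     # 100.0 (complete)
print(percent_complementarity("AGTC", "TCAC"))   # 75.0 (partial)
```

A value of 100 corresponds to "complete" complementarity in the sense used above; anything lower is "partial."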

The term "homology" refers to a degree of complementarity. There may be 
partial homology or complete homology (i.e., identity). A partially complementary 
sequence that at least partially inhibits a completely complementary sequence 
from hybridizing to a target nucleic acid is referred to using the functional term 
"substantially homologous." The inhibition of hybridization of the completely 
complementary sequence to the target sequence may be examined using a hybridization 
assay (Southern or Northern blot, solution hybridization and the like) under conditions 
of low stringency. A substantially homologous sequence or probe will compete for 
and inhibit the binding (i.e., the hybridization) of a completely homologous sequence 
to a target under conditions of low stringency. This is not to say that conditions of low 
stringency are such that non-specific binding is permitted; low stringency conditions 
require that the binding of two sequences to one another be a specific (i.e., selective) 
interaction. The absence of non-specific binding may be tested by the use of a second 
target which lacks even a partial degree of complementarity (e.g., less than about 30% 
identity); in the absence of non-specific binding the probe will not hybridize to the 
second non-complementary target. 

The art knows well that numerous equivalent conditions may be employed to 
comprise low stringency conditions; factors such as the length and nature (DNA, RNA, 
base composition) of the probe and nature of the target (DNA, RNA, base 
composition, present in solution or immobilized, etc.) and the concentration of the salts 
and other components (e.g., the presence or absence of formamide, dextran sulfate, 
polyethylene glycol) are considered and the hybridization solution may be varied to 
generate conditions of low stringency hybridization different from, but equivalent to, 
the above listed conditions. In addition, the art knows conditions which promote 
hybridization under conditions of high stringency (e.g., increasing the temperature of 
the hybridization and/or wash steps, the use of formamide in the hybridization 
solution, etc.). 

When used in reference to a double-stranded nucleic acid sequence such as a 
cDNA or genomic clone, the term "substantially homologous" refers to any probe 
which can hybridize to either or both strands of the double-stranded nucleic acid 
sequence under conditions of low stringency as described above. 

A gene may produce multiple RNA species which are generated by differential 
splicing of the primary RNA transcript. cDNAs that are splice variants of the same 
gene will contain regions of sequence identity or complete homology (representing the 
presence of the same exon or portion of the same exon on both cDNAs) and regions of 
complete non-identity (for example, representing the presence of exon "A" on cDNA 1 
wherein cDNA 2 contains exon "B" instead). Because the two cDNAs contain regions 
of sequence identity they will both hybridize to a probe derived from the entire gene 
or portions of the gene containing sequences found on both cDNAs; the two splice 
variants are therefore substantially homologous to such a probe and to each other. 

When used in reference to a single-stranded nucleic acid sequence, the term 
"substantially homologous" refers to any probe which can hybridize to (i.e., it is the 
complement of) the single-stranded nucleic acid sequence under conditions of low 
stringency as described. 

As used herein, the term "hybridization" is used in reference to the pairing of 
complementary nucleic acids. Hybridization and the strength of hybridization (i.e., the 
strength of the association between the nucleic acids) are impacted by such factors as 
the degree of complementarity between the nucleic acids, stringency of the conditions 
involved, the Tm of the formed hybrid, and the G:C ratio within the nucleic acids. 

As used herein, the term "Tm" is used in reference to the "melting temperature." 
The melting temperature is the temperature at which a population of double-stranded 
nucleic acid molecules becomes half dissociated into single strands. The equation for 
calculating the Tm of nucleic acids is well known in the art. As indicated by standard 
references, a simple estimate of the Tm value may be calculated by the equation: Tm = 
81.5 + 0.41(% G + C), when a nucleic acid is in aqueous solution at 1 M NaCl (See 
e.g., Anderson and Young, Quantitative Filter Hybridization, in Nucleic Acid 
Hybridization [1985]). Other references include more sophisticated computations 
which take structural as well as sequence characteristics into account for the 
calculation of Tm. 
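
The simple melting-temperature estimate quoted above can be written as a short Python sketch. The function name is hypothetical, and the sketch implements only the quoted formula (81.5 plus 0.41 times the percent G + C, for a nucleic acid in aqueous solution at 1 M NaCl); it makes no correction for sequence length, salt concentration, or mismatches, which the more sophisticated computations mentioned above would take into account.

```python
def estimate_tm(seq: str) -> float:
    """Estimate the melting temperature in degrees C as 81.5 + 0.41 * (%G+C).

    Implements only the simple formula quoted in the text (1 M NaCl).
    """
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    gc_count = seq.count("G") + seq.count("C")
    gc_percent = 100.0 * gc_count / len(seq)
    return 81.5 + 0.41 * gc_percent


# A sequence with 50% G + C content.
print(f"{estimate_tm('ATGCATGC'):.1f}")  # 102.0
# A sequence with no G or C.
print(f"{estimate_tm('ATAT'):.1f}")      # 81.5
```

Because the estimate depends only on base composition, two sequences with the same percent G + C receive the same value regardless of length or base order.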

As used herein the term "stringency" is used in reference to the conditions of 
temperature, ionic strength, and the presence of other compounds such as organic 
solvents, under which nucleic acid hybridizations are conducted. With "high 
stringency" conditions, nucleic acid base pairing will occur only between nucleic acid 
fragments that have a high frequency of complementary base sequences. Thus, 
conditions of "weak" or "low" stringency are often required with nucleic acids that are 
derived from organisms that are genetically diverse, as the frequency of 
complementary sequences is usually lower. 

"Amplification" is a special case of nucleic acid replication involving template 
specificity. It is to be contrasted with non-specific template replication (i.e., 
replication that is template-dependent but not dependent on a specific template). 
Template specificity is here distinguished from fidelity of replication (i.e., synthesis of 
the proper polynucleotide sequence) and nucleotide (ribo- or deoxyribo-) specificity. 
Template specificity is frequently described in terms of "target" specificity. Target 
sequences are "targets" in the sense that they are sought to be sorted out from other 
nucleic acid. Amplification techniques have been designed primarily for this sorting 
out. 

Template specificity is achieved in most amplification techniques by the choice 
of enzyme. Amplification enzymes are enzymes that, under the conditions in which 
they are used, will process only specific sequences of nucleic acid in a heterogeneous 
mixture of nucleic acid. For example, in the case of Qβ replicase, MDV-1 RNA is the 
specific template for the replicase (Kacian et al., Proc. Natl. Acad. Sci. USA 69:3038 
[1972]). Other nucleic acid will not be replicated by this amplification enzyme. 
Similarly, in the case of T7 RNA polymerase, this amplification enzyme has a stringent 
specificity for its own promoters (Chamberlin et al., Nature 228:227 [1970]). In the 
case of T4 DNA ligase, the enzyme will not ligate the two oligonucleotides or 
polynucleotides where there is a mismatch between the oligonucleotide or 
polynucleotide substrate and the template at the ligation junction (Wu and Wallace, 
Genomics 4:560 [1989]). Finally, Taq and Pfu polymerases, by virtue of their ability 
to function at high temperature, are found to display high specificity for the sequences 
bounded and thus defined by the primers; the high temperature results in 
thermodynamic conditions that favor primer hybridization with the target sequences 
and not hybridization with non-target sequences (Erlich (ed.), PCR Technology, 
Stockton Press [1989]). 

As used herein, the term "amplifiable nucleic acid" is used in reference to 
nucleic acids which may be amplified by any amplification method. It is contemplated 
that "amplifiable nucleic acid" will usually comprise "sample template." 

As used herein, the term "sample template" refers to nucleic acid originating 
from a sample which is analyzed for the presence of "target" (defined below). In 
contrast, "background template" is used in reference to nucleic acid other than sample 
template which may or may not be present in a sample. Background template is most 
often inadvertent. It may be the result of carryover, or it may be due to the presence 
of nucleic acid contaminants sought to be purified away from the sample. For 
example, nucleic acids from organisms other than those to be detected may be present 
as background in a test sample. 

As used herein, the term "primer" refers to an oligonucleotide, whether 
occurring naturally as in a purified restriction digest or produced synthetically, which 
is capable of acting as a point of initiation of synthesis when placed under conditions 
in which synthesis of a primer extension product which is complementary to a nucleic 
acid strand is induced (i.e., in the presence of nucleotides and an inducing agent such 
as DNA polymerase and at a suitable temperature and pH). The primer is preferably 
single stranded for maximum efficiency in amplification, but may alternatively be 
double stranded. If double stranded, the primer is first treated to separate its strands 
before being used to prepare extension products. Preferably, the primer is an 
oligodeoxyribonucleotide. The primer must be sufficiently long to prime the synthesis 
of extension products in the presence of the inducing agent. The exact lengths of the 
primers will depend on many factors, including temperature, source of primer and the 
use of the method. 

A primer is selected to be "substantially" complementary to a strand of specific 
sequence of the template. A primer must be sufficiently complementary to hybridize 
with a template strand for primer elongation to occur. A primer sequence need not 
reflect the exact sequence of the template. For example, a non-complementary 
nucleotide fragment may be attached to the 5' end of the primer, with the remainder of 
the primer sequence being substantially complementary to the strand. 
Non-complementary bases or longer sequences can be interspersed into the primer, 
provided that the primer sequence has sufficient complementarity with the sequence of 
the template to hybridize and thereby form a template primer complex for synthesis of 
the extension product of the primer. 

As used herein, the term "nested primers" refers to primers that anneal to the 
target sequence in an area that is inside the annealing boundaries used to start PCR 
(See, Mullis et al., Cold Spring Harbor Symposia, Vol. LI, pp. 263-273 [1986]). 
Because the nested primers anneal to the target inside the annealing boundaries of the 
starting primers, the predominant PCR-amplified product of the starting primers is 
necessarily a longer sequence than that defined by the annealing boundaries of the 
nested primers. The PCR-amplified product of the nested primers is an amplified 
segment of the target sequence that cannot, therefore, anneal with the starting primers. 

As used herein, the term "probe" refers to an oligonucleotide (i.e., a sequence 
of nucleotides), whether occurring naturally as in a purified restriction digest or 
produced synthetically, recombinantly or by PCR amplification, which is capable of 
hybridizing to another oligonucleotide of interest. A probe may be single-stranded or 
double-stranded. Probes are useful in the detection, identification and isolation of 
particular gene sequences. It is contemplated that any probe used in the present 
invention will be labelled with any "reporter molecule," so that it is detectable in any 
detection system, including, but not limited to, enzyme (e.g., ELISA, as well as 
enzyme-based histochemical assays), fluorescent, radioactive, and luminescent systems. 
It is not intended that the present invention be limited to any particular detection 
system or label. 

The term "label" as used herein refers to any atom or molecule which can be 
used to provide a detectable (preferably quantifiable) signal, and which can be attached 
to a nucleic acid or protein. Labels may provide signals detectable by fluorescence, 
radioactivity, colorimetry, gravimetry, X-ray diffraction or absorption, magnetism, 
enzymatic activity, and the like. 

As used herein, the term "target," when used in reference to the polymerase 
chain reaction, refers to the region of nucleic acid bounded by the primers used for 
polymerase chain reaction. Thus, the "target" is sought to be sorted out from other 
nucleic acid sequences. A "segment" is defined as a region of nucleic acid within the 
target sequence. 

The term "substantially single-stranded" when used in reference to a nucleic 
acid target means that the target molecule exists primarily as a single strand of nucleic 
acid in contrast to a double-stranded target which exists as two strands of nucleic acid 
which are held together by inter-strand base pairing interactions. 

Nucleic acids form secondary structures which depend on base-pairing for 
stability. When single strands of nucleic acids (single-stranded DNA, denatured 
double-stranded DNA or RNA) with different sequences, even closely related ones, are 
allowed to fold on themselves, they assume characteristic secondary structures. An 
alteration in the sequence of the target may cause the destruction of a duplex region(s), 
or an increase in the stability of a duplex region, thereby altering the accessibility of 
some regions to hybridization of the probe oligonucleotides. While not being limited 
to any particular theory, it is thought that individual molecules in the target population 
may each assume only one or a few of the structures (i.e., duplexed regions), but when 
the sample is analyzed as a whole, a composite pattern from the hybridization of the 
probes can be created. Many of the structures that can alter the binding of the probes 
are likely to be only a few base-pairs long and would appear to be unstable. Some of 
these structures may be displaced by the hybridization of a probe in that region; others 
may be stabilized by the hybridization of a probe nearby, such that the probe/substrate 
duplex can stack coaxially with the target intrastrand duplex, thereby increasing the 
stability of both. The formation or disruption of these structures in response to small 
sequence changes results in changes in the patterns of probe/target complex formation. 

As used herein, the term "polymerase chain reaction" ("PCR") refers to the 
method of Mullis, U.S. Patent Nos. 4,683,195, 4,683,202, and 4,965,188, hereby 
incorporated by reference, which describe a method for increasing the concentration of 
a segment of a target sequence in a mixture of genomic DNA without cloning or 
purification. This process for amplifying the target sequence consists of introducing a 
large excess of two oligonucleotide primers to the DNA mixture containing the desired 
target sequence, followed by a precise sequence of thermal cycling in the presence of a 
DNA polymerase. The two primers are complementary to their respective strands of 
the double stranded target sequence. To effect amplification, the mixture is denatured 
and the primers then annealed to their complementary sequences within the target 
molecule. Following annealing, the primers are extended with a polymerase so as to 
form a new pair of complementary strands. The steps of denaturation, primer 
annealing and polymerase extension can be repeated many times (i.e., denaturation, 
annealing and extension constitute one "cycle"; there can be numerous "cycles") to 
obtain a high concentration of an amplified segment of the desired target sequence. 
The length of the amplified segment of the desired target sequence is determined by 
the relative positions of the primers with respect to each other, and therefore, this 
length is a controllable parameter. By virtue of the repeating aspect of the process, the 
method is referred to as the "polymerase chain reaction" (hereinafter "PCR"). Because 
the desired amplified segments of the target sequence become the predominant 
sequences (in terms of concentration) in the mixture, they are said to be "PCR 
amplified". 
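
Two points in the description above lend themselves to a small numerical sketch: the length of the amplified segment is fixed by the primer positions, and repeated cycling raises the concentration of that segment. The Python below is an idealized illustration only. The function names, the 0-based coordinate convention, and the per-cycle "efficiency" parameter are assumptions made for this example; they do not come from the text, and real reactions plateau rather than growing without limit.

```python
def amplicon_length(forward_start: int, reverse_end: int) -> int:
    """Length of the amplified segment defined by the two primers.

    forward_start: 0-based template position of the first base of the
                   forward primer.
    reverse_end:   0-based template position of the last base covered by
                   the reverse primer (inclusive).
    Both coordinate conventions are assumptions of this sketch.
    """
    if reverse_end < forward_start:
        raise ValueError("reverse primer must lie downstream of forward primer")
    return reverse_end - forward_start + 1


def copies_after_cycles(initial_copies: int, cycles: int,
                        efficiency: float = 1.0) -> float:
    """Idealized copy number after a number of cycles.

    Each cycle multiplies the copy number by (1 + efficiency); an
    efficiency of 1.0 means perfect doubling (an idealization).
    """
    if cycles < 0 or not 0.0 <= efficiency <= 1.0:
        raise ValueError("cycles must be >= 0 and efficiency in [0, 1]")
    return initial_copies * (1.0 + efficiency) ** cycles


print(amplicon_length(100, 349))     # 250
print(copies_after_cycles(1, 30))    # 1073741824.0
```

Under perfect doubling, a single starting copy yields roughly a billion copies after 30 cycles, which is consistent with the amplified segment becoming the predominant sequence in the mixture.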

With PCR, it is possible to amplify a single copy of a specific target sequence 
in genomic DNA to a level detectable by several different methodologies (e.g., 
hybridization with a labeled probe; incorporation of biotinylated primers followed by 
avidin-enzyme conjugate detection; incorporation of 32P-labeled deoxynucleotide 
triphosphates, such as dCTP or dATP, into the amplified segment). In addition to 
genomic DNA, any oligonucleotide or polynucleotide sequence can be amplified with 
the appropriate set of primer molecules. In particular, the amplified segments created 
by the PCR process itself are, themselves, efficient templates for subsequent PCR 
amplifications. 

As used herein, the terms "PCR product," "PCR fragment," and "amplification 
product" refer to the resultant mixture of compounds after two or more cycles of the 
PCR steps of denaturation, annealing and extension are complete. These terms 
encompass the case where there has been amplification of one or more segments of 
one or more target sequences. 

As used herein, the term "amplification reagents" refers to those reagents 
(deoxyribonucleotide triphosphates, buffer, etc.), needed for amplification except for 
primers, nucleic acid template and the amplification enzyme. Typically, amplification 
reagents along with other reaction components are placed and contained in a reaction 
vessel (test tube, microwell, etc.). 

As used in reference to amplification methods such as PCR, the term 
"polymerase" refers to any polymerase suitable for use in the amplification of nucleic 
acids of interest. It is intended that the term encompass such DNA polymerases as the 
polymerase III of the present invention, as well as Taq DNA polymerase (i.e., the type 
I polymerase obtained from Thermus aquaticus), although other polymerases, both 
thermostable and thermolabile, are also encompassed by this definition. 

As used herein, the term "RT-PCR" refers to the replication and amplification 
of RNA sequences. In this method, reverse transcription is coupled to PCR, most 
often using a one enzyme procedure in which a thermostable polymerase is employed, 
as described in U.S. Patent No. 5,322,770, herein incorporated by reference. In 
RT-PCR, the RNA template is converted to cDNA due to the reverse transcriptase 
activity of the polymerase, and then amplified using the polymerizing activity of the 
polymerase (i.e., as in other PCR methods). 

As used herein, the terms "restriction endonucleases" and "restriction enzymes" 
refer to bacterial enzymes, each of which cuts double-stranded DNA at or near a 
specific nucleotide sequence. 

As used herein, the term "recombinant DNA molecule" refers to 
a DNA molecule which is comprised of segments of DNA joined together by means of 
molecular biological techniques. 

The terms "in operable combination," "in operable order," and "operably 
linked" as used herein refer to the linkage of nucleic acid sequences in such a manner 
that a nucleic acid molecule capable of directing the transcription of a given gene 
and/or the synthesis of a desired protein molecule is produced. The term also refers to 
the linkage of amino acid sequences in such a manner so that a functional protein is 
produced. 

The term "isolated" when used in relation to a nucleic acid, as in "an isolated 
oligonucleotide" or "isolated polynucleotide" refers to a nucleic acid sequence that is 
identified and separated from at least one contaminant nucleic acid with which it is 
ordinarily associated in its natural source. As such, isolated nucleic acid is present in a 
form or setting that is different from that in which it is found in nature. In contrast, 
non-isolated nucleic acids are nucleic acids such as DNA and RNA found in the state 
they exist in nature. The isolated nucleic acid, oligonucleotide, or polynucleotide may 
be present in single-stranded or double-stranded form. When an isolated nucleic acid, 
oligonucleotide or polynucleotide is to be utilized to express a protein, the 
oligonucleotide or polynucleotide will contain at a minimum the sense or coding strand 
(i.e., the oligonucleotide or polynucleotide may be single-stranded), but may contain 
both the sense and anti-sense strands (i.e., the oligonucleotide or polynucleotide may be 
double-stranded). 

As used herein, the term "purified" or "to purify" refers to the removal of 
contaminants from a sample. For example, anti-DNA polymerase III holoenzyme and 
holoenzyme subunit antibodies are purified by removal of contaminating non- 
immunoglobulin proteins; they are also purified by the removal of immunoglobulin 
that does not bind DNA polymerase III holoenzyme or holoenzyme subunit. The 
removal of non-immunoglobulin proteins and/or the removal of immunoglobulins that 
do not bind DNA polymerase III holoenzyme or holoenzyme subunit results in an 
increase in the percent of DNA polymerase III holoenzyme or holoenzyme subunit- 
reactive immunoglobulins in the sample. In another example, recombinant DNA 
polymerase III holoenzyme or holoenzyme subunit polypeptides are expressed in 
bacterial host cells and the polypeptides are purified by the removal of host cell 
proteins; the percent of recombinant DNA polymerase III holoenzyme or holoenzyme 
subunit polypeptides is thereby increased in the sample. 

The term "recombinant protein" or "recombinant polypeptide" as used herein 
refers to a protein molecule which is expressed from a recombinant DNA molecule. 

The term "native protein" is used herein to indicate that a protein does not 
contain amino acid residues encoded by vector sequences; that is, the native protein 
contains only those amino acids found in the protein as it occurs in nature. A native 
protein may be produced by recombinant means or may be isolated from a naturally 
occurring source. 

As used herein the term "portion" when in reference to a protein (as in "a 
portion of a given protein") refers to fragments of that protein. The fragments may 
range in size from four amino acid residues to the entire amino acid sequence minus 
one amino acid. 

As used herein, the term "fusion protein" refers to a chimeric protein containing 
the protein of interest (i.e., DNA polymerase III holoenzyme or holoenzyme subunit 
and fragments thereof) joined to an exogenous protein fragment (the fusion partner 
which consists of a non-DNA polymerase III holoenzyme or holoenzyme subunit 
protein). The fusion partner may enhance solubility of the DNA polymerase III 
holoenzyme or holoenzyme subunit protein as expressed in a host cell, may provide an 
affinity tag to allow purification of the recombinant fusion protein from the host cell 
or culture supernatant, or both. If desired, the fusion partner may be removed from 
the protein of interest (i.e., DNA polymerase III holoenzyme, holoenzyme subunit 
protein, or fragments thereof) by a variety of enzymatic or chemical means known to 
the art. 

A "variant" of DNA polymerase III holoenzyme or holoenzyme subunit, as 
used herein, refers to an amino acid sequence that is altered by one or more amino 
acids. The variant may have "conservative" changes, wherein a substituted amino acid 
has similar structural or chemical properties (e.g., replacement of leucine with 
isoleucine). More rarely, a variant may have "nonconservative" changes (e.g., 
replacement of a glycine with a tryptophan). Similar minor variations may also 
include amino acid deletions or insertions, or both. Guidance in determining which 
amino acid residues may be substituted, inserted, or deleted without abolishing 
biological or immunological activity may be found using computer programs well 
known in the art, for example, DNASTAR software. 

The term "sequence variation" as used herein refers to differences in nucleic 
acid sequence between two nucleic acid templates. For example, a wild-type structural 
gene and a mutant form of this wild-type structural gene may vary in sequence by the 
presence of single base substitutions and/or deletions or insertions of one or more 
nucleotides. These two forms of the structural gene are said to vary in sequence from 
one another. A second mutant form of the structural gene may exist. This second 
mutant form is said to vary in sequence from both the wild-type gene and the first 
mutant form of the gene. It is noted, however, that the invention does not require that 
a comparison be made between one or more forms of a gene to detect sequence 
variations. Because the method of the invention generates a characteristic and 
reproducible pattern of complex formation for a given nucleic acid target, a 
characteristic "fingerprint" may be obtained from any nucleic acid target without 
reference to a wild-type or other control. The invention contemplates the use of the 
method for both "fingerprinting" nucleic acids without reference to a control and 
identification of mutant forms of a target nucleic acid by comparison of the mutant 
form of the target with a wild-type or known mutant control. 
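
The simplest kind of sequence variation named above, a single base substitution between a wild-type and a mutant form, can be illustrated with a short Python sketch. The function name and the short example sequences are hypothetical; the sketch compares two equal-length sequences position by position and does not handle insertions or deletions, which would require an alignment step. It illustrates the definition only and is not the fingerprinting method of the invention.

```python
from typing import List, Tuple


def substitutions(reference: str, variant: str) -> List[Tuple[int, str, str]]:
    """List (0-based position, reference base, variant base) for each mismatch.

    Assumes equal-length sequences; insertions and deletions are not handled.
    """
    if len(reference) != len(variant):
        raise ValueError("sequences must have equal length")
    return [
        (i, r, v)
        for i, (r, v) in enumerate(zip(reference.upper(), variant.upper()))
        if r != v
    ]


# Hypothetical short fragments differing by one base (a GAG codon for
# glutamic acid versus a GTG codon for valine).
wild_type = "CCTGAGGAG"
mutant = "CCTGTGGAG"
print(substitutions(wild_type, mutant))  # [(4, 'A', 'T')]
```

An empty list means the two forms do not vary in sequence at any compared position.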

As used herein, the term "target nucleic acid" refers to the region of nucleic 
acid bounded by the primers used for polymerase chain reaction. Thus, the "target" is 
sought to be sorted out from other nucleic acid sequences. A "segment" is defined as 
a region of nucleic acid within the target sequence. 

The term "nucleotide analog" as used herein refers to modified or non-naturally 
occurring nucleotides such as 7-deaza purines (i.e., 7-deaza-dATP and 7-deaza-dGTP). 
Nucleotide analogs include base analogs and comprise modified forms of 
deoxyribonucleotides as well as ribonucleotides. As used herein the term "nucleotide 
analog" when used in reference to targets present in a PCR mixture refers to the use of 
nucleotides other than dATP, dGTP, dCTP and dTTP; thus, the use of dUTP (a 
naturally occurring dNTP) in a PCR would comprise the use of a nucleotide analog in 
the PCR. A PCR product generated using dUTP, 7-deaza-dATP, 7-deaza-dGTP or any 
other nucleotide analog in the reaction mixture is said to contain nucleotide analogs. 

"Oligonucleotide primers matching or complementary to a gene sequence" 
refers to oligonucleotide primers capable of facilitating the template-dependent 
synthesis of single or double-stranded nucleic acids. Oligonucleotide primers matching 
or complementary to a gene sequence may be used in PCRs, RT-PCRs and the like. 

A "consensus gene sequence" refers to a gene sequence which is derived by 
comparison of two or more gene sequences and which describes the nucleotides most 
often present in a given segment of the genes; the consensus sequence is the canonical 
sequence. "Consensus protein," "consensus amino acid," "consensus peptide," and 
"consensus polypeptide" sequences refer to sequences that are shared between multiple 
organisms or proteins. 
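
The idea of a consensus gene sequence, the nucleotides most often present at each position across two or more sequences, can be sketched as follows. This is a minimal illustration with a hypothetical function name; it assumes the input sequences are already aligned and of equal length, and where two bases are equally frequent in a column it simply keeps the one seen first.

```python
from collections import Counter
from typing import Sequence


def consensus(sequences: Sequence[str]) -> str:
    """Return the most frequent base at each position.

    Assumes pre-aligned sequences of equal length. Ties are resolved in
    favor of the base encountered first in the column.
    """
    if not sequences:
        raise ValueError("at least one sequence is required")
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("sequences must have equal length")
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*sequences)
    )


# Three hypothetical aligned fragments.
print(consensus(["ATGC", "ATGA", "ACGC"]))  # ATGC
```

In practice, sequences of unequal length would first need to be aligned; that step is outside the scope of this sketch.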

The term "biologically active," as used herein, refers to a protein or other 
biologically active molecules (e.g., catalytic RNA) having structural, regulatory, or 
biochemical functions of a naturally occurring molecule. Likewise, "immunologically 
active" refers to the capability of the natural, recombinant, or synthetic DNA 
polymerase III holoenzyme or holoenzyme subunit, or any oligopeptide or 
polynucleotide thereof, to induce a specific immune response in appropriate animals or 
cells and to bind with specific antibodies. 

The term "agonist," as used herein, refers to a molecule which, when bound to 
DNA polymerase III holoenzyme or holoenzyme subunit, causes a change in DNA 
polymerase III holoenzyme or holoenzyme subunit, which modulates the activity of 
DNA polymerase III holoenzyme or holoenzyme subunit. Agonists may include
proteins, nucleic acids, carbohydrates, or any other molecules which bind or interact
with DNA polymerase III holoenzyme or holoenzyme subunit. 

The terms "antagonist" or "inhibitor," as used herein, refer to a molecule 
which, when bound to DNA polymerase III holoenzyme or holoenzyme subunit, 
blocks or modulates the biological or immunological activity of DNA polymerase III
holoenzyme or holoenzyme subunit. Antagonists and inhibitors may include proteins,
nucleic acids, carbohydrates, or any other molecules which bind or interact with DNA
polymerase III holoenzyme or holoenzyme subunit.

The term "modulate," as used herein, refers to a change or an alteration in the 
biological activity of DNA polymerase III holoenzyme or holoenzyme subunit. 
Modulation may be an increase or a decrease in protein activity, a change in binding 
characteristics, or any other change in the biological, functional, or immunological 
properties of DNA polymerase III holoenzyme or holoenzyme subunit. 

The term "derivative," as used herein, refers to the chemical modification of a 
nucleic acid encoding DNA polymerase III holoenzyme or holoenzyme subunit, or the 
encoded DNA polymerase III holoenzyme or holoenzyme subunit. Illustrative of such 
modifications would be replacement of hydrogen by an alkyl, acyl, or amino group. A 
nucleic acid derivative would encode a polypeptide which retains essential biological 
characteristics of the natural molecule. 

The term "Southern blot," refers to the analysis of DNA on agarose or 
acrylamide gels to fractionate the DNA according to size followed by transfer of the 
DNA from the gel to a solid support, such as nitrocellulose or a nylon membrane. 
The immobilized DNA is then probed with a labeled probe to detect DNA species 
complementary to the probe used. The DNA may be cleaved with restriction enzymes 
prior to electrophoresis. Following electrophoresis, the DNA may be partially 
depurinated and denatured prior to or during transfer to the solid support. Southern 
blots are a standard tool of molecular biologists (Sambrook et al., Molecular Cloning:
A Laboratory Manual, Cold Spring Harbor Press, NY, pp 9.31-9.58 [1989]). 

The term "Northern blot," as used herein refers to the analysis of RNA by 
electrophoresis of RNA on agarose gels to fractionate the RNA according to size
followed by transfer of the RNA from the gel to a solid support, such as nitrocellulose
or a nylon membrane. The immobilized RNA is then probed with a labeled probe to
detect RNA species complementary to the probe used. Northern blots are a standard
tool of molecular biologists (Sambrook et al., supra, pp 7.39-7.52 [1989]).

The term "Western blot" refers to the analysis of protein(s) (or polypeptides)
immobilized onto a support such as nitrocellulose or a membrane. The proteins are
run on acrylamide gels to separate the proteins, followed by transfer of the protein
from the gel to a solid support, such as nitrocellulose or a nylon membrane. The
immobilized proteins are then exposed to antibodies with reactivity against an antigen
of interest. The binding of the antibodies may be detected by various methods,
including the use of radiolabeled antibodies.

The term "antigenic determinant" as used herein refers to that portion of an 
antigen that makes contact with a particular antibody (i.e., an epitope). When a
protein or fragment of a protein is used to immunize a host animal, numerous regions
of the protein may induce the production of antibodies which bind specifically to a
given region or three-dimensional structure on the protein; these regions or structures
are referred to as antigenic determinants. An antigenic determinant may compete with
the intact antigen (i.e., the "immunogen" used to elicit the immune response) for
binding to an antibody.

The terms "specific binding" or "specifically binding" when used in reference to
the interaction of an antibody and a protein or peptide mean that the interaction is
dependent upon the presence of a particular structure (i.e., the antigenic determinant or
epitope) on the protein; in other words the antibody is recognizing and binding to a
specific protein structure rather than to proteins in general. For example, if an
antibody is specific for epitope "A," the presence of a protein containing epitope A (or
free, unlabelled A) in a reaction containing labelled "A" and the antibody will reduce
the amount of labelled A bound to the antibody.

As used herein, the term "cell culture" refers to any in vitro culture of cells.
Included within this term are continuous cell lines (e.g., with an immortal phenotype),
primary cell cultures, finite cell lines (e.g., non-transformed cells), and any other cell
population maintained in vitro.

The terms "test DNA polymerase III holoenzyme" and "test holoenzyme
subunit" refer to a sample suspected of containing DNA polymerase III holoenzyme
or holoenzyme subunit, respectively. The concentration of DNA polymerase III
holoenzyme or holoenzyme subunit in the test sample is determined by various means,
and may be compared with a "quantitated amount of DNA polymerase III holoenzyme
or holoenzyme subunit" (i.e., a positive control sample containing a known amount of
DNA polymerase III holoenzyme or holoenzyme subunit), in order to determine
whether the concentration of test DNA polymerase III holoenzyme or holoenzyme
subunit in the sample is within the range usually found within samples from wild-type
organisms.

The term "microorganism" as used herein means an organism too small to be 
observed with the unaided eye and includes, but is not limited to, bacteria, viruses,
protozoans, fungi, and ciliates. 

The term "microbial gene sequences" refers to gene sequences derived from a 
microorganism. 

The term "bacteria" refers to any bacterial species including eubacterial and 
archaebacterial species. 

The term "virus" refers to obligate, ultramicroscopic, intracellular parasites 
incapable of autonomous replication (i.e., replication requires the use of the host cell's 
machinery). 

DESCRIPTION OF THE INVENTION 

The present invention relates to gene and amino acid sequences encoding DNA 
polymerase III holoenzyme subunits and structural genes from thermophilic organisms.
In particular, the present invention provides DNA polymerase III holoenzyme subunits
of T. thermophilus. The present invention also provides antibodies and other reagents 
useful to identify DNA polymerase III molecules. 

Prior to the present invention, only one type of DNA polymerase had been
discovered in thermophilic eubacteria, even though others have been actively sought
(See e.g., Lawyer et al., J. Biol. Chem., 264:6427-6437 [1989]; Chien et al., J.
Bacteriol., 127:1550-1557 [1976]; and Kaledin et al., Biochem., 45:494-501 [1981]).
The present invention provides a pol III-class polymerase and an associated pol III
holoenzyme auxiliary subunit homolog from the thermophile T. thermophilus. 

This invention was developed in a step-wise fashion, using various techniques,
including the use of general gap-filling assays to monitor polymerase activity and the
use of cross-reactive monoclonal antibodies against the α catalytic subunit of E. coli
DNA polymerase III holoenzyme to distinguish the novel polymerase from the
well-characterized DNA polymerase I-like Thermus thermophilus DNA polymerase.

Indeed, a survey of 12 monoclonal antibodies directed against the 130 kDa α
subunit of the E. coli pol III holoenzyme revealed a subset that reacted with a protein
of approximately the same size in Western blots of T. thermophilus extracts. These
antibodies were used to distinguish the pol III polymerase of the present invention
from the characterized T. thermophilus polymerase during protein fractionation
procedures.

Two proteins migrating with the polymerase and present in approximate 
stoichiometric ratio after three chromatographic steps were isolated and subjected to 
partial amino acid sequencing. The amino termini of both were homologous to the
two products of the E. coli dnaX gene, the γ and τ subunits of the DNA polymerase
III holoenzyme. Using this information and sequences conserved among dnaX-like
genes, a gene fragment was isolated by PCR, and used as a probe to isolate the full
length Thermus thermophilus dnaX gene. The deduced amino acid sequence was
found to be highly homologous to the DnaX proteins of other bacteria. Examination
of the sequence permitted identification of a frameshift site similar to the one used in
E. coli to direct the synthesis of the shorter γ DnaX gene product.

Conservation of a frameshifting mechanism to generate related ATPases is
significant in that, by analogy to E. coli, both can assemble a β processivity factor
onto primed DNA. Both a 63 kDa τ subunit that has a molecular weight consistent
with its being a full length dnaX translation product, and a 50 kDa γ subunit that
likely arises by translational frameshifting were detected in enzyme purified from T.
thermophilus extracts. Examination of the dnaX DNA sequence provided confirmation
of this. In E. coli, ribosomes frameshift at the sequence A AAA AAG into a -1 frame
where the lysine UUU anticodon tRNA can base pair with 6As before elongating
(Flower and McHenry, Proc. Natl. Acad. Sci. USA 87:3713-3717 [1990]; Blinkowa
and Walker, Nucl. Acids Res., 18:1725-1729 [1990]; and Tsuchihashi and Kornberg,
Proc. Natl. Acad. Sci. USA 87:2516-2520 [1990]). In T. thermophilus, the putative
frameshift site has the sequence A AAA AAA A, which would enable either a +1 or
-1 frameshift. The +1 frameshift product would extend only one residue beyond the
lys-lys encoding sequence where the frameshift occurs, similar to the E. coli -1
product. A -1 frameshift would encode a protein with a 12-amino acid extension.
Such an extension could permit an interaction that may further distinguish γ from τ
functionally or could loop back to stabilize its structure in a thermal environment.
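
Although an understanding of the mechanism is not necessary in order to use the present
invention, the effect of a -1 or +1 translational frameshift on the encoded product may
be sketched in Python as follows. The sequence used is a short hypothetical sequence
built around an A AAA AAG slippery site and is not the dnaX sequence of either organism;
the function names and the `slip_end` parameter are likewise illustrative assumptions.
The codon table is the standard genetic code.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna, start=0):
    """Translate from `start` until a stop codon or the end of the sequence."""
    dna = dna.upper().replace("U", "T")
    protein = []
    for pos in range(start, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[pos:pos + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


def translate_with_shift(dna, slip_end, shift):
    """Read frame 0 up to `slip_end` (exclusive), then resume at slip_end + shift."""
    return translate(dna[:slip_end]) + translate(dna, start=slip_end + shift)


# Hypothetical sequence: ATG GCA AAA AAG AGT GAA CGT TAA
toy = "ATGGCAAAAAAGAGTGAACGTTAA"
slip_end = 12  # index just past the AAG codon of the slippery site

print("frame 0:", translate(toy))
print("-1 shift:", translate_with_shift(toy, slip_end, -1))
print("+1 shift:", translate_with_shift(toy, slip_end, +1))
```

In this toy sequence the -1 shift adds a single residue after the lys-lys pair before
reaching a stop codon, which mirrors the behavior described above for the E. coli -1
product.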

The present invention also provides additional polymerase III holoenzyme 
subunits. For example, the dnaQ sequence of T. thermophilus (SEQ ID NO:214) and
partial amino acid sequence of DnaQ (SEQ ID NOS:215 and 216) are provided, as 
well as the dnaE sequence of T. thermophilus (SEQ ID NO: 196), and DnaE amino 
acid sequence (SEQ ID NO: 197). 

It is contemplated that purified DnaQ will find use in PCR and other 
applications in which high fidelity DNA synthesis is required or desirable. Although 
an understanding of the mechanism is not necessary in order to use the present 
invention, DnaQ binds to the α subunit of DNA polymerase III, and works with it to
efficiently remove errors made by the DNA polymerase III.

It is also contemplated that DnaQ will find use in place of an adjunct
proofreading polymerase in PCR and other amplification applications. For example,
when combined in an amplification reaction with a DNA polymerase that lacks a
proofreading exonuclease, the DnaQ will facilitate elongation of PCR product as it is
capable of removing mismatches within the PCR product. Thus, it is contemplated
that the present invention will find use in such applications as long-range PCR (e.g., 
PCR involving 5-50 kb targets). 

In addition to the DnaQ and DnaE components provided by the present 
invention, a portion of dnaN (SEQ ID NO:230) and the corresponding deduced amino
acid sequence (SEQ ID NO:231) are also provided. It is contemplated that the DnaN
protein will find use in purification of the β subunit (i.e., the critical subunit that
permits pol III to catalyze a processive (i.e., long-distance without dissociating)
amplification reaction). DnaN is useful with pol III alone (e.g., α or α plus ε) on
linear templates in the absence of additional subunits, or it can be used with the DnaX
complex, as well as with additional proteins (e.g., single-stranded binding proteins,
helicases, and/or other accessory factors), to permit very long PCR reactions. 

As mentioned above, E. coli pol III holoenzyme can remain associated with
primed DNA for 40 minutes and replicates DNA at 500-1000 nucleotides/second
(McHenry, Ann. Rev. Biochem., 57:519-550 [1988]). Existing PCR technology is
limited by relatively non-processive repair-like DNA polymerases. The present
invention provides a thermophilic replicase capable of rapid replication and highly
processive properties at elevated temperatures. It is contemplated that the
compositions of the present invention will find use in many molecular biology
applications, including megabase PCR by removing the current length restrictions, long
range DNA sequencing and sequencing through DNA with high secondary structure, as
well as enabling new technological advances in molecular biology.

EXPERIMENTAL

The following examples serve to illustrate certain preferred embodiments and 
aspects of the present invention and are not to be construed as limiting the scope 
thereof.

In the experimental disclosure which follows, the following abbreviations
apply: g (gram); L (liter); µg (microgram); ml (milliliter); bp (base pair); °C (degrees
Centigrade); kb or Kb (kilobases); kDa or kd (kilodaltons); EDTA
(ethylenediaminetetraacetic acid); DTT (dithiothreitol); LB (Luria Broth); -mer
(oligomer); DMV (DMV International, Frazier, NY); PAGE (polyacrylamide gel
electrophoresis); SDS (sodium dodecyl sulfate); SDS-PAGE (sodium dodecyl sulfate
polyacrylamide gel electrophoresis); SSPE (2x SSPE contains 0.36 mM NaCl, 20 mM
NaH2PO4, pH 7.4, and 20 mM EDTA, pH 7.4; the concentration of SSPE used may
vary); SOP media (20 g/l tryptone (Difco), 10 g/l yeast extract (Difco), 5 g/l NaCl, 2.5
g/l potassium phosphate, dibasic (Fisher), 1 g/l MgSO4·7H2O (Fisher), pH 7.2); TE
buffer (10 mM Tris, 1 mM EDTA); 50x TAE (242 g Tris base, 57.1 ml glacial acetic
acid, 100 ml 0.5 M EDTA pH 8.0); Blotto (10% skim milk dissolved in dH2O and
0.2% sodium azide); Gel Loading Dye (0.25% Bromophenol blue, 0.25% xylene
cyanol, 25% Ficoll (Type 400) in dH2O); Pre-hybridization mix (50% Formamide, 5X
SSPE, 1% SDS, 0.5% CARNATION™ non-fat dried milk, 10% skim milk, 0.2% Na
Azide); FBS (fetal bovine serum); ABS, Inc. (ABS, Inc., Wilmington, DE);
GeneCodes (GeneCodes, Ann Arbor, MI); Boehringer Mannheim (Boehringer
Mannheim, Indianapolis, IN); Champion Industries (Champion Industries, Clifton, NJ);
Organon (Organon Teknika Corp., Durham NC); Difco (Difco, Detroit, MI); Enzyco
(Enzyco Inc., Denver, CO); Fisher Scientific (Fisher Scientific, Fair Lawn, NJ); FMC
(FMC, Rockland, Maine); Gibco BRL (Gibco BRL Gaithersburg, MD); Hyclone
(Hyclone, Logan UT); Intermountain or ISC (ISC BioExpress, Bountiful, Utah);
Invitrogen (Invitrogen, Carlsbad, CA); Millipore (Millipore, Marlborough, MA); MJ
Research (MJ Research, Watertown, MA); Molecular Probes (Molecular Probes,
Eugene, OR); National Diagnostics (National Diagnostics, Manville, NJ); Pharmacia
Biotech (Pharmacia Biotech., Piscataway, NJ); Promega (Promega Corp., Madison,
WI); Qiagen (Qiagen, Chatsworth, CA); PE/ABI (Perkin Elmer Applied
Biosystems Division, Foster City, CA); Sigma (Sigma, St. Louis, MO); Stratagene
(Stratagene, LaJolla CA); Tecan (Tecan, Research Triangle Park, NC); Whatman
(Whatman, Maidstone, England); Lofstrand Labs (Lofstrand Labs, Ltd., Gaithersburg,
Maryland); LSPI (LSPI Filtration Products, Life Science Products, Denver, CO);
Irvine (Irvine Scientific, Irvine CA); and Jackson Labs (Jackson Labs, Bar Harbor,
Maine).

In Examples in which a molecular weight based on SDS-PAGE gels is reported
for a protein, the molecular weight values reported are approximate values. In
addition, all DNA sequences indicated herein are approximate, with the exception of
the full-length gene sequences for dnaE and dnaX, which are believed to be accurate.

EXAMPLE 1

Selection of a DNA Polymerase Assay to Monitor the Purification of a
Multi-Subunit DNA Polymerase III from Thermus thermophilus

In order to monitor the purification of a multi-subunit DNA polymerase III
from the thermophilic Thermus thermophilus, the suitability of a DNA polymerase
assay which is commonly used for measurement of the activity of DNA polymerase III
in mesophilic bacteria (e.g., Escherichia coli) was investigated. However, the
unsuitability of the prior art assay for use at high temperatures necessitated the
development of an alternative assay for DNA polymerase activity for use at high
temperatures. This Example involved a) long single-stranded template mesophilic
DNA polymerase assay, and b) development of a gap-filling assay for DNA
polymerase activity.

A. Long Single-Stranded Template Mesophilic DNA Polymerase Assay

For mesophilic bacteria, such as E. coli, the assay system used to measure
DNA replication employs a relatively long single-stranded DNA template, M13Gori, as
substrate that is primed with RNA at a unique site by the action of dnaG primase (Cull
and McHenry, Meth. Enzymol., 262:22-35 [1995]). The ability to use this assay at
high temperature for monitoring polymerase III activity from thermophilic organisms 
was investigated as follows. 

The M13Gori template (Enzyco) was first primed with RNA as described 
below. The following components were mixed together on ice in the following order:
243 µl of primer-template solution (60 mM HEPES (pH 7.5), 14 mM magnesium
acetate, 2.8 mM ATP, GTP, CTP and UTP, 14% glycerol, 56 mM NaCl, 42 mM
potassium glutamate, 84 ng/ml bovine serum albumin (BSA) and 4 mM DTT), with
7.2 µl M13Gori (2.98 mg/ml), 63 µl E. coli single-stranded binding protein (SSB)
(Enzyco) (2.2 mg/ml) and 27 µl DnaG primase (Enzyco) (0.39 mg/ml). This mixture
was then incubated for 15 min at 30°C to form the "primer-template." The above
mixture provided enough primed M13Gori for over 100 polymerase assays.

The DNA polymerase reaction mixture was assembled by adding the following 
components on ice: 300 µl of primer-template was added to 2.1 ml of polymerization
reaction solution (180 mM Bicine (pH 8.0), 25% glycerol, 0.017% Nonidet-P40, 170
µg/ml BSA, 83 mM potassium glutamate, 8 mM DTT, 2.3 mM magnesium acetate, 10
µg/ml rifampicin, 57 µM each of dGTP, dATP and dCTP, and 21 µM TTP) and the
solutions were mixed to form the "polymerase reaction mixture." This mixture can be
used immediately or stored for several months at -80°C by flash-freezing in liquid N2.
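
As a check on the volumes recited above, the dilution arithmetic may be sketched in
Python as follows. The helper name `diluted_conc` is hypothetical, and the sketch
assumes that volumes are simply additive and that each assay consumes one 15 µl
aliquot of the polymerase reaction mixture; neither assumption is stated in the text.

```python
def diluted_conc(stock_conc, stock_vol_ul, total_vol_ul):
    """C1 * V1 = C2 * V2, solved for the diluted concentration C2."""
    return stock_conc * stock_vol_ul / total_vol_ul


# Primer-template assembly: buffer + M13Gori + SSB + DnaG primase (volumes in µl).
primer_template_ul = 243 + 7.2 + 63 + 27
m13_in_primer_template = diluted_conc(2.98, 7.2, primer_template_ul)  # mg/ml

# Polymerase reaction mixture: 300 µl primer-template + 2.1 ml reaction solution.
reaction_mix_ul = 300 + 2100
m13_in_reaction_mix = diluted_conc(m13_in_primer_template, 300, reaction_mix_ul)

# Assumed 15 µl per assay.
aliquot_ul = 15
aliquots = reaction_mix_ul // aliquot_ul

print(f"primer-template volume: {primer_template_ul:.1f} µl")
print(f"M13Gori in reaction mix: {m13_in_reaction_mix * 1000:.2f} µg/ml")
print(f"15 µl aliquots available: {aliquots}")
```

Under these assumptions the mixture yields well over 100 aliquots, which agrees with
the statement that enough primed M13Gori was provided for over 100 polymerase assays.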

To test the stability of the polymerase reaction mixture at elevated 
temperatures, 15 µl aliquots of the reaction mixture were incubated at 55°C, 65°C, or
75°C, for various periods of time varying between 1 and 40 min. The reaction tubes
were then immediately placed on ice and allowed to cool to 0°C. E. coli DNA
polymerase III holoenzyme (Enzyco) was then added and the reaction tubes incubated
for 5 min at 30°C before measuring the amount of double-stranded (ds) DNA
synthesized in the reaction. 

The amount of ds DNA synthesized was measured using a fluorometric assay 
that relies on the properties of the fluorophor PicoGreen™ (Molecular Probes). This 
dye displays significant fluorescence only in the presence of ds DNA and is almost
completely non-fluorescent in the presence of single stranded DNA. Two hundred
fifty microliters of a 1:400 dilution of PicoGreen™ fluorophor in Tris-HCl (pH 7.5), 1
mM EDTA was added to each reaction tube. The fluorescence intensity was measured
using an SLT fluorescent plate reader (Tecan) using a 385 nm filter for excitation and
a 535 nm filter for emission.

The results of these polymerase assays are summarized in Figure 1. In Figure
1, the percentage of polymerase activity supported by the assay mixture was plotted
against the incubation time in minutes. In this Figure, the time points indicated
represent the length of time that the reaction mixture was incubated at the elevated
temperature, before chilling, adding holoenzyme, and incubating for 5 minutes at
30°C. The solid squares, the solid triangles and the open circles represent assays
conducted at 75°C, 65°C and 55°C, respectively.

The data shown in Figure 1 demonstrate that one or more components of the
assay mixture were inactivated at the highest temperature tested. For example, at 75°C
complete inactivation occurred in less than 1 min. Thus, this system was unsuitable for
the measurement of DNA polymerase activity at 75°C.

B. Development of a Gap-Filling Assay for DNA Polymerase 

Because the polymerase reaction mixture in the M13Gori assay was found to be
unstable at elevated temperatures, the DNA gap-filling assay was used to monitor
polymerase activity during purification, although the DNA gap-filling assay does not
discriminate between the different types of polymerases (e.g., polymerase I and III).
The majority (>90%) of gap-filling activity in bacterial cells is due to the presence of
DNA polymerase I, making it difficult to identify other polymerases, including DNA
polymerase III. Nonetheless, as discussed below, this assay was found to be suitable
to detect T. thermophilus pol III activity during the purification procedures described
in the following Examples. 

The gap-filling assay was performed as follows. An assay mixture containing 
32 mM HEPES (pH 7.5), 13% glycerol, 0.01% Nonidet P40, 0.13 mg/ml BSA, 10 
mM MgCl2, 0.2 mg/ml activated calf-thymus DNA (Enzyco), 57 µM each of dGTP,
dATP and dCTP, and 21 µM [3H]TTP (360 cpm/pmol) was assembled.

The reaction was started by the addition of 0.5 µl of a suitable dilution of
DNA polymerase to 15 µl of reaction mix, and incubated at 37°C for 5-50 min. The
reaction was then stopped by placing the reaction tube on ice. 

The amount of DNA synthesized in the assay was measured by first 
precipitating the polynucleotide with 2 drops of 0.2 M inorganic pyrophosphate (PPi) 
and 0.5 ml 10% TCA. Removal of unincorporated nucleotide triphosphates was 
accomplished by filtering the mixture through GF/C filters (Whatman) and washing the
filters with 12 ml 0.2 M PPi/1 M HCl and then ethanol. The filters were then allowed
to dry before scintillation counting; Ecoscint-0 (National Diagnostics) was used as the
scintillant. One unit of enzyme activity is defined as one picomole of TTP
incorporated per min at 37°C. Positive controls containing E. coli DNA pol III were
included in the assay since this assay does not distinguish the activity of pol I, pol II,
and pol III. 
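
The conversion of scintillation counts to units of enzyme activity, using the specific
radioactivity of 360 cpm/pmol and the unit definition given above, may be sketched in
Python as follows. The function names, the background-subtraction step, and the example
counts are illustrative assumptions and are not data from the present Examples.

```python
def units_from_cpm(cpm, background_cpm, minutes, cpm_per_pmol=360.0):
    """Units = pmol TTP incorporated per minute (per the unit definition above)."""
    pmol_ttp = (cpm - background_cpm) / cpm_per_pmol
    return pmol_ttp / minutes


def units_per_ml(units_in_assay, enzyme_vol_ul=0.5, dilution_factor=1.0):
    """Scale the activity in one assay back to the undiluted enzyme stock."""
    return units_in_assay / (enzyme_vol_ul / 1000.0) * dilution_factor


# Hypothetical reading: 18,100 cpm on the filter, 100 cpm background, 5 min assay.
assay_units = units_from_cpm(18_100, 100, minutes=5)
print(f"{assay_units:.1f} units in the assay")
print(f"{units_per_ml(assay_units, dilution_factor=10):.0f} units/ml of undiluted enzyme")
```

The sketch assumes that incorporation is linear over the incubation period; where the
response is non-linear, a shorter incubation or a greater dilution would be needed
before the calculation is meaningful.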

The gap-filling assay was selected instead of the long stranded template DNA 
polymerase assay to monitor the purification of DNA polymerase III activity from T.
thermophilus. In the following Examples, the assay was performed at 37 °C, in part
for convenience, but also because the decrease in activity for pol III at lower
temperatures is less than that of the major T. thermophilus DNA polymerase activity,
permitting more efficient detection of pol III because it comprised a greater percentage
of total polymerase activity when assayed this way. One unit of activity is equivalent
to 1 pmol total nucleotide synthesis per minute at 37 °C.

EXAMPLE 2

Purification of a DNA Polymerase Multi-Subunit
Complex From T. thermophilus

This Example involved a) growth of T. thermophilus strain pMF48.kat cells,
and b) large-scale purification of a DNA polymerase III multi-subunit complex from T.
thermophilus.

The MF48.kat strain (Lasa et al., Molec. Microbiol., 6:1555-1564 [1992]) was used for
the purification of a polymerase multi-subunit complex. Strain pMF48.kat is a T.
thermophilus strain mutated to be deficient in the protective S-layer protein found on
the outer coat. This mutation renders the strain susceptible to lysis by hen-egg white
lysozyme. The lysis procedure of Cull and McHenry (Cull and McHenry, Meth.
Enzymol., 262:22-35 [1995]) for isolation of E. coli DNA polymerase III holoenzyme
(i.e., the replicative enzyme of E. coli) requires a gentle lysis procedure using
lysozyme and the addition of spermidine to precipitate the chromosomal DNA with the
cellular debris. The use of the S-layer mutant of T. thermophilus was found to allow
the use of a modification of the standard E. coli gentle lysis procedure (Cull and
McHenry, supra), as described below.

A complex containing at least three proteins (α, τ, and γ) having DNA
polymerase activity as determined by the gap-filling assay was purified from T.
thermophilus MF48.kat. Partial amino acid sequencing of these T. thermophilus
proteins revealed that these proteins share homology with subunits of the E. coli DNA
polymerase III holoenzyme.

A. Growth of T. thermophilus Strain pMF48.kat Cells 

MF48.kat cells were grown as previously described (Lasa et al., Molec.
Microbiol., 6:1555-1564 [1992]), harvested, and the pellet stored at -80°C as glycerol
stock until ready for use. DNA polymerase was purified from large scale fermentation
cultures of MF48.kat cells. These steps are further elaborated in the following
sections.

1. Preparation of Media for Fermentation of T.
thermophilus Strain pMF48.kat

T. thermophilus strain pMF48KAT (Lasa et al., 1992) was grown in 180 L
batches with aeration in a 250 L fermentor at 72 °C in a medium containing (per L):
0.27 g ferric chloride hexahydrate, 0.294 g sodium citrate trisodium salt dihydrate,
0.025 g calcium sulfate dihydrate, 0.20 g magnesium chloride hexahydrate, 0.53 g
ammonium chloride, 8 g pancreatic digest of casein (DMV International), 4 g yeast
extract (Ardamine Z, Champlain Industries), 4.0 g glucose (Cerelose 2001, food grade
(Corn Products, International)), 0.5 g L-glutamic acid monosodium salt, 0.254 g sodium
phosphate monobasic (dissolved in 1000/180 ml water), 1.5 g dipotassium phosphate
(dissolved in 2000/180 ml water). Following the addition of the final component to
the fermentation medium, the pH of the broth was adjusted to pH 7.3 at 72°C with 3N
NaOH. Fermentation was conducted so as to avoid precipitation or turbidity, and
achieve growth over 1.2 O.D.600. During preparation of the fermentation medium,
adequate time (approximately 4 minutes) was allowed for medium constituents to go
into solution prior to the addition of the next medium component while constantly
stirring.

2. Growth Conditions 

Seed cultures were grown from frozen glycerol stock as follows and stored at
-80°C. One ml of frozen glycerol stock was added to 250 ml fermentation medium
prepared as described above and additionally containing kanamycin (30 µg/ml) at 72°C
in a baffled Fernbach flask with 45 mm screw top closure, and the culture incubated
overnight in an environmental shaker (New Brunswick) with low agitation (~100 rpm).
At approximately 20 hours, the resulting seed culture was transferred to 180 L of
fermentation medium in a fermenter (IF250 New Brunswick Scientific). Fermentation
was carried out at 72°C, pH 7.3, 100% O2 (measured with Ingold O2 Transmitter 4300
with Ingold probes blanked on non-sparged, non-agitated medium, and at a pressure of
3 pounds per square inch, as determined by the pressure gauge on the fermenter) over
atmospheric pressure with air flow at 40 L/min and rpm at 80. 

A 5 L sugar mixture consisting of 480 g/L glucose and 60 g/L L-glutamic acid
was added during the run. The sugar mixture was prepared by incrementally adding
and dissolving the sugar and glutamic acid in hot, sterile deionized water. One liter of
the sugar mixture was added at OD600 0.7-0.9, followed by the addition of 1 L of sugar
mixture at each doubling thereafter (i.e., at the rate of approximately 50-100 ml sugar
solution per minute). At the time of transfer of the seed culture from the flask to the
large fermenter, the culture was mixed with aeration (sparge) only in the absence of
agitation. Agitation was begun at OD600 0.7-0.9. As the culture density increased
and the available oxygen approached zero (measured with Ingold O2 Transmitter 4300
with Ingold probes blanked on non-sparged, non-agitated medium), the agitation was
increased to keep the detectable O2 levels just above zero.

pH control was not used during the initial static growth phase until OD600
reached approximately 0.7-0.9, at which time pH was controlled with 3N sodium
hydroxide to maintain a pH of 7.3 at 72°C. The pH was measured with an Ingold pH
controller/transmitter 2500 with Ingold probes. Foaming was controlled mechanically. 

Cells were harvested at OD600 2.5-3.0 by transferring the culture via a hose
through an Alfa-Laval plate heat exchanger (72°C culture: ~10°C water exchange
medium) to a Sharples AS-16 continuous flow centrifuge. A cell paste was obtained
by centrifugation in a Sharples AS-16 continuous flow centrifuge, for approximately 1
hour at 17,000 rpm (13,000 x g) at room temperature. The cell paste was resuspended
1:1 (w/w) in Buffer TS (50 mM Tris-HCl, 10% sucrose (pH 7.5)) and then frozen in
liquid nitrogen as pellets and stored at -20°C.

B. Partial Purification of T. thermophilus DNA Polymerase III 
and Associated Proteins 

To avoid formation of a precipitate which was observed when protein solutions 
were kept at 4°C, all of the following operations for large scale purification were
performed at room temperature unless otherwise stated. Cells were lysed with
lysozyme, and an ammonium sulfate fraction collected. The ammonium sulfate
fraction was first resolved by cation exchange chromatography, followed by
hydrophobic chromatography. These steps are described below.

Cell Lysis and Ammonium Sulfate Fractionation

First, 4.7 kg of a 1:1 suspension of cells in Tris-sucrose were added to 6.5 L
Tris-sucrose that had been prewarmed to 55 °C. To the stirred mixture, 117 ml of 0.5
M DTT, and 590 ml of 2M NaCl, 0.3M spermidine in Tris-sucrose adjusted to pH 7.5,
were added. The pH of the slurry was adjusted to pH 8 by the addition of 2 M Tris
base, and 2.35 g lysozyme was added. The slurry was distributed into 250 ml
centrifuge bottles and incubated at 30 °C for 1 hour with occasional inversion, and
then centrifuged at 23,000 x g for 60 min. at 4 °C. The recovered supernatant (8 L)
constituted Fraction I (Table 1).

TABLE 1

Partial Purification of T. thermophilus Pol III and Associated Proteins

Fraction  Method of Purification           Units      Protein  Spec. Act.
                                           (x 10^3)   (mg)     (Units/mg)

I.        Cell lysis                       ND^a       73,800   ND
II.       Amm. sulfate precipitation       2,200      6,540    360
III.      BioRex-70 chromatography         6,700      686      10,000
IV.       ToyoPearl ether chromatography   1,400      54       25,000
V.        Q-Sepharose chromatography       760^c      4.8      160,000^b

a Fraction I was not assayed due to a non-linear response, presumably due to nuclease or
inhibitory contaminants.

b Assays were conducted at 37°C. The specific activity is 5-fold higher (800,000 units/mg) if
assayed at 60°C.

c The reported yield is normalized to a complete preparation using all of the material from
Fraction IV. Only 50% of Fraction IV was used in the preparation of Fraction V.

To Fraction I (FrI), ammonium sulfate (0.267 g/initial ml Fraction I) was added over a 15
min interval. The mixture was stirred for an additional 30 min at 4 °C and then
centrifuged at 23,000 x g for 60 min at 0 °C. The recovered pellet was resuspended
(on ice) in 563 ml (0.07 x Fraction I volume) 50 mM Tris-HCl (pH 7.5), 20%
glycerol, 1 mM EDTA, 0.1 M NaCl, 5 mM DTT, 0.18 g (added to each ml of final
solution) ammonium sulfate and the resulting suspension centrifuged at 23,000 x g for
60 min at 0°C resulting in Fraction II (Table 1).

The precipitate was thoroughly resuspended on ice in 563 ml (0.07 x vol. of
Fraction I) of backwash solution (50 mM Tris-HCl (pH 7.5), 20% glycerol, 1 mM
EDTA, 0.1 M NaCl, 5 mM DTT, 0.18 g/initial ml ammonium sulfate [i.e., 0.18 g
ammonium sulfate were added for each 1 ml of initial solution volume]), and
centrifuged at 12,000 rpm in a GSA rotor for 60 min at 0°C. The pellet (Fraction II)
was stored at 4°C.
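
The amounts used in this step scale linearly with the volume of Fraction I. The following Python sketch recomputes them from the values given above, assuming the rounded 8 l volume reported for Fraction I; the function and variable names are illustrative only and are not part of the original procedure.

```python
# Scaling of the ammonium sulfate cut with the Fraction I volume.
# All numeric inputs come from the text; the names are illustrative only.

FRACTION_I_ML = 8000.0        # recovered supernatant, reported as 8 l
AMS_CUT_G_PER_ML = 0.267      # g ammonium sulfate per initial ml of Fraction I
BACKWASH_REL_VOLUME = 0.07    # backwash volume relative to Fraction I volume
BACKWASH_ML_REPORTED = 563.0  # backwash volume stated in the text


def scaled_amount(volume_ml: float, per_ml: float) -> float:
    """Return an amount that scales linearly with volume."""
    return volume_ml * per_ml


ams_grams = scaled_amount(FRACTION_I_ML, AMS_CUT_G_PER_ML)
backwash_ml = scaled_amount(FRACTION_I_ML, BACKWASH_REL_VOLUME)
# Fraction I volume implied by the reported 563 ml backwash volume
implied_fraction_i_ml = BACKWASH_ML_REPORTED / BACKWASH_REL_VOLUME

print(f"ammonium sulfate for the cut: {ams_grams:.0f} g")
print(f"backwash volume from 8 l: {backwash_ml:.0f} ml")
print(f"Fraction I volume implied by 563 ml: {implied_fraction_i_ml:.0f} ml")
```

The small difference between the computed backwash volume and the reported 563 ml suggests that the measured Fraction I volume was slightly above the rounded 8 l figure.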

C. Identification Of Monoclonal Antibodies Cross-Reactive With The α
Subunit Of T. thermophilus DNA Polymerase III Using ELISA

1. Generation of Monoclonal Antibodies 

Monoclonal antibodies were generated by the University of Colorado Cancer
Center Monoclonal Antibody Core. Standard procedures were used as described (See
e.g., Oi and Herzenberg, in Mishell and Shiigi (eds.), Selected Methods in Cellular
Immunology, W.H. Freeman & Co., San Francisco [1980], p 351-371). Briefly, 100
µg of DNA polymerase III α subunit (Enzyco) in complete Freund's Adjuvant was
injected subcutaneously into the back of the neck of a BALB/c Bailey mouse (Jackson
Labs). Four weeks later, each mouse was boosted intraperitoneally with 50 µg in
Incomplete Freund's Adjuvant (IFA). Each animal was boosted again two weeks later
with the same injection. Two weeks later, mice were bled from the tail and their
polyclonal titer against α determined. The mouse giving the best response was
selected for future sacrifice, and was given an additional 20 µg booster (without
adjuvant). Three days later, the mouse was sacrificed, the spleen aseptically removed
and processed by mincing into several 2-3 mm pieces. The spleen pieces were bathed
in RPMI-serum free medium (one 10 L package RPMI Powder (Irvine), 20 g sodium
bicarbonate, 43 g HEPES, 100 ml non essential amino acids (NEAA, Irvine), 1.1 g
sodium pyruvate, 2.92 g L-Glutamine, 50 ml Pen-Strep (Irvine) adjusted to pH 7.05, in
10 L deionized H2O, and filtered through a Gelman filter). Cells were transferred to a
50 ml conical tube, pelleted and washed in RPMI-serum free medium. Cells were
manipulated to remove red blood cells and centrifuged for 5 min at 1400 rpm.
Remaining red blood cells were removed by resuspending the spleen cell pellet in 10
ml of 0.83% NH4Cl at room temperature for 90 seconds. After 90 sec, 20 ml of
RPMI-serum free medium was added, the cells were swirled to mix, and centrifuged at
1400 rpm for 5 min. The cell pellet was washed with RPMI-serum free medium,
centrifuged (5 min, 1400 rpm) and resuspended in 20.3 ml RPMI-serum free medium.
Spleen cells and THT myeloma cells (Fox-NY Mouse Myeloma, ATCC 1732-CRL;
cultured in RPMI with 10% FBS (Hyclone Lot 2334) and 15 µg/ml 8-azaguanine
(pH 7.1-7.2) under 7% CO2 at 37 °C) were mixed at a 4:1 ratio in a 50 ml
conical tube. Mixed cells were centrifuged 10 min at 1400 rpm, and the supernatant
was aspirated. PEG (50%) at room temperature was added drop-wise over 1 minute
with gentle stirring. Approximately 1 ml was added to 2 x 10^8 cells. The PEG was
then diluted by drop-wise addition of 9 ml RPMI-serum free medium. Cells were
centrifuged at 1400 rpm for 10 min.
The supernatant from the centrifuged cells was aspirated and the cells were
resuspended in 10 ml of RPMI containing 15% FBS. Then, 10 ml of cells were plated
per 100 mm Petri dish, and the cells were incubated overnight at 37 °C in 7% CO2.
Cells were then harvested, collected by centrifugation and resuspended in 205 ml
RPMI containing 15% FBS + AHAT medium (75 µM adenine, 0.1 mM
hypoxanthine, 0.4 µM aminopterin, 16 µM thymine) in a T75 flask. In addition, 2
drops of this suspension were plated into the wells of a 96-well CoStar dish. This
procedure was continued to generate twenty 96-well dishes. These were incubated at
37 °C in 7% CO2. On the third day after plating, 2 drops/well of fresh RPMI
containing 15% FBS and AHAT, as well as 1x Hybridoma Cloning Supplement
(Boehringer-Mannheim) medium were added, with fluid changes every three days
thereafter by aspirating l A of the medium from each well and feeding 2 drops/well of
fresh RPMI with 15% FBS and AHAT. Screening was performed when the cells
reached 1/3 confluency (ca. 10 days post fusion).

2. Monoclonal Antibody Screening 

Screening was performed by an ELISA procedure. Twenty 96-well dishes
(Dynatech Laboratories) were coated with 50 µl/well of a 1 ng/ml solution of the α
subunit of the E. coli DNA polymerase III holoenzyme diluted in PBS (8 g/l NaCl, 0.2
g/l KCl, 1.15 g/l Na2HPO4, 0.2 g/l KH2PO4 (pH 7.4)). Plates were washed (3x) in
PBS with 0.1% Tween 20, and blocked by the addition of 200 µl/well of 1% BSA in
PBS. After washing, hybridoma supernatant (50 µl/well) was added from the
corresponding well of the 96-well culture dish. After washing, Fc-specific goat
anti-mouse IgG-horseradish peroxidase conjugated antibody (Organon) (100 µl/well;
diluted 1:5000 in 1% BSA in PBS) was added, and incubated for 3 hours at room
temperature. Plates were washed and exposed to 100 µl substrate solution and
incubated for 15 minutes in the dark. Substrate solution was prepared by adding 150
ml citric acid buffer (10.2 g/l citric acid monohydrate, 26.8 g/l sodium phosphate
dibasic, pH adjusted to 4.9 with phosphoric acid) to 60 mg O-phenylenediamine
(Sigma). Just prior to use, 60 µl of 30% hydrogen peroxide was added. Reactions
were quenched by the addition of 50 µl sulfuric acid/well, and read at 490 nm.
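
As a quick check on the substrate recipe, the final concentrations can be worked out from the stated amounts. This is a minimal Python sketch; it assumes the 150 ml buffer volume is the only diluent, and the helper name is illustrative.

```python
# Final concentrations in the ELISA substrate solution (amounts from the text).

BUFFER_ML = 150.0          # citric acid buffer
OPD_MG = 60.0              # O-phenylenediamine
H2O2_STOCK_PERCENT = 30.0  # hydrogen peroxide stock
H2O2_UL = 60.0             # volume of stock added just prior to use


def final_percent(stock_percent: float, added_ml: float, base_ml: float) -> float:
    """Percent concentration after adding a stock to a base volume."""
    return stock_percent * added_ml / (base_ml + added_ml)


opd_mg_per_ml = OPD_MG / BUFFER_ML
h2o2_percent = final_percent(H2O2_STOCK_PERCENT, H2O2_UL / 1000.0, BUFFER_ML)

print(f"O-phenylenediamine: {opd_mg_per_ml:.2f} mg/ml")
print(f"hydrogen peroxide: {h2o2_percent:.4f} %")
```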

From the preliminary screening, 24 wells appeared positive (O.D. 490 > 1.0).
Control wells containing only diluent (i.e., 1% BSA in PBS), growth medium, and
medium containing AHAT and myeloma cells only, gave an O.D. of 0.2 or less. Cells
from the 24 wells that produced monoclonal antibodies against the α subunit of DNA
polymerase III holoenzyme were each expanded in a well of a 24-well plate and tested
again by ELISA for continued secretion of antibody against α. Three wells failed to
show continued growth or secretion. The remaining candidates were also screened by
a Western blot procedure with a mixture of the DNA polymerase III holoenzyme
subunits (as described in the section on following T. thermophilus pol III by
Western blotting of hydrophobic column fractions, except that 0.5 µg of each of the
E. coli DNA polymerase III holoenzyme subunits was loaded in a mixture and blotted
against undiluted serum).

From the 21 remaining candidates, 12 wells were selected that reacted strongly
and specifically with α on Western blots for cloning by limiting dilution (antibodies #
178, 210, 257, 279, 645, 889, 1018, 1104, 1171, 1283, 1950, 1976). The remaining
polyclonal wells were frozen down by the procedure described below. The antibodies
produced were isotyped using an ELISA and isotype-specific antibodies. The two
antibodies used in this study, 1950 and 1104, were both IgG.

For cloning by the limiting dilution technique, cells were removed from the
chosen positive wells and placed in a 15 ml conical tube. An aliquot of 600,000 cells
was taken and added to a tube containing RPMI-serum free media to bring the final
cell density to 60,000 cells/ml. Cells were diluted 1:10 successively with RPMI-serum
free media until a dilution of 600 cells/ml was achieved. Cells were then diluted with
HCS (Hybridoma Cloning Supplement, Boehringer Mannheim) until concentrations of
50, 10 and 5 cells/ml were achieved. These were then aliquoted into wells of a 96
well plate (2 drops/well), incubated in a 37 °C incubator and visually scored after five
days in an inverted microscope, by noting the wells that appeared to have colonies that
arose from a single cell (one colony/well). These wells were screened by ELISA to
identify clones that were producing antibody directed against α.
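
The dilution scheme above can be laid out numerically. The Python sketch below uses only the cell numbers and densities stated in the text; the function name and the idea of expressing the final step as a fold dilution are illustrative assumptions.

```python
# Dilution scheme for cloning by limiting dilution (densities from the text).

CELLS_IN_ALIQUOT = 600_000
START_DENSITY = 60_000           # cells/ml after the first dilution
SERIAL_END_DENSITY = 600         # cells/ml reached by the 1:10 steps
PLATING_DENSITIES = (50, 10, 5)  # cells/ml aliquoted into 96-well plates


def tenfold_series(start: int, stop: int) -> list:
    """Densities visited by successive 1:10 dilutions from start down to stop."""
    series = [start]
    while series[-1] > stop:
        series.append(series[-1] // 10)
    return series


# Volume needed to bring the aliquot to the starting density
start_volume_ml = CELLS_IN_ALIQUOT / START_DENSITY
series = tenfold_series(START_DENSITY, SERIAL_END_DENSITY)
# Fold dilution from 600 cells/ml to each plating density
final_folds = {d: SERIAL_END_DENSITY / d for d in PLATING_DENSITIES}

print(start_volume_ml, series, final_folds)
```

Two successive 1:10 steps take the suspension from 60,000 to 600 cells/ml, after which 12-, 60- and 120-fold dilutions in HCS give the three plating densities.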

The 12 candidate positive clones were expanded into 24 well plates and
rescreened by ELISA. These were expanded further into duplicate wells in a 24 well
plate and then grown in 100 mm dishes and supernatant was collected. Cells were
frozen away in 2 x 10^6 cell aliquots (in 95% FBS + 5% DMSO) and stored in a liquid
nitrogen freezer.

3. ELISA Assay of an Ammonium Sulfate Fraction

In general, large asymmetric complexes are relatively insoluble in ammonium
sulfate. Ammonium sulfate fractionation provides a 50-fold purification of the E. coli
pol III holoenzyme. A low ammonium sulfate cut of T. thermophilus extracts was
used to provide a source of protein enriched sufficiently in pol III that an ELISA assay
could be used to screen 12 monoclonal antibodies directed against the E. coli pol III α
subunit to determine if they cross-reacted with a T. thermophilus protein. The
ammonium sulfate fraction was prepared by addition of 0.246 g ammonium sulfate to
each ml of Fraction I using the same approach as described for the preparation of
Fraction II in the pol III preparation described under Methods. For the ELISA
screening assay, all manipulations were conducted at room temperature. Into each well
of a 96-well microtiter plate (Corning Costar High Binding EIA/RIA) was placed 4 µg
protein in 150 µl Buffer E7 (10 mM Tris-HCl (pH 7.5), 150 mM NaCl, 0.05% Tween-20).
After an overnight incubation, each well was blocked by incubation with Buffer
E7 + 10 µg/ml BSA for 3 h, followed by 3 washes with Buffer E7 containing 10
mg/ml BSA and the addition of 150 µl of hybridoma supernatant, which was incubated
for 3.5 hours. Wells were then washed 3 times with Buffer E7 and once with an equivalent
buffer that had been adjusted to pH 8.8 (Buffer E8). Then, 150 µl of a 1:3000
dilution of goat anti-mouse IgG antibody-alkaline phosphatase conjugate (BioRad) was
added and incubated for 1 hour. Wells were washed 3 times with Buffer E7, once
with Buffer E8 and developed in the presence of p-nitrophenyl phosphate for 10 min.
Absorbance was read at 405 nm. Monoclonal antibodies produced from hybridoma lines
C1950-F3 and C1104-H2 (Enzyco) gave an absorbance of 0.17 and 0.16 respectively;
on average, all other candidate monoclonal supernatants gave an absorbance of 0.06.
Equal volumes of both supernatants were combined for future work.

In addition, 40 µg of the 0.246 g/ml ammonium sulfate fraction of T. thermophilus
FrI was subjected to the Western Blotting procedure described below. A band
migrating with the same mobility as the α subunit of pol III of E. coli was detected.
These data demonstrated that Fraction II of T. thermophilus contained an α
subunit of DNA polymerase III, and that a mixture of monoclonal antibodies from
hybridoma cell lines C1950-F3 and C1104-H2 is capable of detecting the T.
thermophilus DNA polymerase III α subunit. These data also demonstrated that the
ammonium sulfate precipitations were optimized to provide a nearly quantitative
precipitation of the candidate T. thermophilus α subunit (as judged by Western blots)
while removing as much contaminating protein as possible (Fraction II; Table 1).

D. Cation Exchange Chromatography 

Having established that a roughly 130 kDa protein cross-reacted with anti-E.
coli α monoclonals, a lysis procedure and ammonium sulfate fractionation that
partially purified the α subunit were optimized.

A BioRex 70 cation exchange chromatography step was developed that resolved
polymerase activity from the majority of contaminating protein. Fraction II was
resuspended in 100 ml buffer U (50 mM imidazole-HCl (pH 6.8), 20% glycerol, 35
mM ammonium sulfate, 1 mM magnesium acetate, 0.1 mM zinc sulfate, 5 mM
β-mercaptoethanol, 0.1 mM ATP) and dialyzed twice successively versus 2 L buffer U.
Dialysate was applied to a 300 ml BioRex 70 (BioRad, 100-200 mesh, 5.5 cm
diameter) column equilibrated in buffer U and washed with 0.9 l buffer U. Activity
was eluted with a 1.5 l 0->300 mM NaCl gradient in buffer U. All gradient-eluted
fractions containing greater than 20,000 units of gap-filling polymerase activity/ml
were pooled, constituting Fraction III (Table 1).

Figure 2 shows the column profile of protein and DNA polymerase activity.
Fractions 20-80 in Figure 2A, which contained bound DNA polymerase activity, were
pooled to generate Fraction III. Ammonium sulfate solution saturated at 4°C was
added to 50% (v/v), and centrifuged at 23,000 x g for 60 min at 4°C to obtain a
precipitate.

Although the BioRex-70 column separated a large amount of contaminating
protein from the polymerase activity and resulted in enrichment of the polymerase
fraction, it did not resolve different polymerases, since it did not resolve the pol III
antibody reactive fraction from the majority of polymerase activity. Nevertheless, a
25-fold increase in gap-filling polymerase specific activity was achieved (Fraction III,
Table 1). Resolution of different DNA polymerase activities was achieved by
hydrophobic interaction chromatography as described below. Figure 2A shows the
results from one column run, while Figure 2B shows the results from another column
run. Although the results are not completely reproducible, these Figures show that the
DNA polymerase preparation was enriched by the BioRex-70 column.

E. Hydrophobic Chromatography

E. coli pol III holoenzyme binds tightly to hydrophobic columns (McHenry and
Kornberg, J. Biol. Chem., 252:6478-6484 [1977]). Thus, hydrophobic interaction
chromatography was attempted to resolve T. thermophilus pol III from the smaller and,
presumably, more hydrophilic DNA polymerase I-like activity.

Hydrophobic interaction chromatography was used to resolve a unique
polymerase that cross-reacted with antibody directed against E. coli pol III α subunit.
Fraction III protein was precipitated by the addition of an equal volume of saturated
ammonium sulfate and the pellet was collected by centrifugation (23,000 x g, 1 h, 4
°C). The pellet was dissolved in 40 ml buffer U, and 20 ml ToyoPearl-Ether 650M
(Toso Haas) equilibrated in buffer U and 1.6 M ammonium sulfate was added. To the
stirred suspension was added (dropwise) 33.6 ml 4 M ammonium sulfate and the entire
mixture was applied to an 80 ml ToyoPearl-Ether column (3.5 cm diameter)
equilibrated in buffer U with 1.6 M ammonium sulfate. The column was washed with
80 ml buffer U + 1.1 M ammonium sulfate and polymerase activity eluted with a 1.5 l
1.1->0.3 M ammonium sulfate gradient in buffer U (Figure 3). Fractions containing
polymerase activity were subjected to the Western Blotting procedure. The second
peak (fractions 45-50) that contained polymerase activity and reacted with
monoclonal antibodies against E. coli pol III α subunit was pooled, constituting
Fraction IV (120 ml, Table 1, Figure 3). Fractions were assayed for gap-filling
polymerase activity as described above and for protein content using the Coomassie
Protein Assay Reagent (Pierce) with BSA used as a standard.

Figure 3 shows a profile of the hydrophobic column chromatography. Protein
(open circles) and gap-filling polymerase activity (filled squares) were plotted against
fraction numbers. The gap-filling assay revealed 2 peaks of polymerase activity, with
the first peak eluting between fraction numbers 20-30, and the second peak eluting
between fraction numbers 45-50. The major T. thermophilus DNA polymerase peak
eluted early in the gradient, and did not react with anti-α monoclonal antibodies. The
second peak, a minor peak comprising approximately 9% of the polymerase activity,
bound more tightly to the column (Fraction IV, Table 1).

F. Western Blotting and Identification of Proteins Cross Reactive with
Monoclonal Antibody Against T. thermophilus DNA Polymerase III
Subunit α

In order to determine whether the fractions which eluted from the hydrophobic
chromatography column contained T. thermophilus DNA polymerase III α subunit,
fractions containing DNA polymerase activity as determined by the gap filling assay
(described in Example 1) were further analyzed by Western blotting using monoclonal
antibodies which were specific for T. thermophilus DNA polymerase III α subunit
(described above). The following steps were carried out at room temperature unless
otherwise indicated.

Aliquots of column fractions (50 µl) were subjected to electrophoresis on a
10% SDS-PAGE. Proteins were then transferred from the gel to a PVDF membrane
(BioRad). The transfer buffer contained 25 mM Tris, 192 mM glycine, and 20%
methanol, and was adjusted to pH 8.5 by the addition of HCl. Transfer was conducted
for 2 h at 70 V (0.7 A). The membrane was then washed in 10 mM Tris-HCl (pH
7.5), 150 mM NaCl, 0.05% Tween-20 followed by incubation for 1 h in the same
buffer with 5% dried milk. The blot was then incubated overnight with C1950-F3,
C1104-H2 hybridoma supernatant (Enzyco), washed (3 times) in 100 ml buffer TBS
containing 0.5% Tween-20, incubated for 1 hour with a 1:2000 dilution of goat
anti-mouse IgG alkaline phosphatase conjugate (BioRad) in buffer TBS containing 0.5%
dried milk, washed (3 times) in 100 ml buffer TBS containing 0.5% Tween-20, and
washed once in 100 ml buffer P (100 mM Tris-HCl (pH 9.5), 100 mM NaCl, 5 mM
MgCl2). The membrane was developed by incubating in a 1:150 dilution of a GIBCO
nitroblue tetrazolium stock solution and a 1:300 dilution of a 5-bromo-4-chloro-3-indolyl
phosphate stock solution in buffer B, until the α band reached the desired
intensity. The development of color was stopped by washing the membrane in water.

Figure 4 shows the results of the SDS-PAGE / Western blot analysis for
fractions 41-52. Lane numbers 41-52 contained ToyoPearl-Ether column fractions
41-52. Western blot analysis showed that the second peak (fractions ~45-50) contained
protein molecules with a molecular weight of approximately 130 kDa which
cross-reacted with antibodies from cell lines C1950-F3 and C1104-H2 raised against the α
subunit of E. coli DNA polymerase III holoenzyme (Figure 4).

Figure 4 shows that the cross-reactive 130 kDa band eluted with an intensity
parallel to the polymerase activity in the second peak. This demonstrates that the
second peak of activity contained a pol III-like polymerase, which is distinct from the
characterized T. thermophilus DNA polymerase. This is in contrast to fractions 20-30
in the first peak which did not result in detectable binding to E. coli DNA polymerase
III antibody. Fractions 45-50 were used to generate Fraction IV (stored at 4°C).


G. Anion Exchange Chromatography 

To provide a higher level of purification, and to enable direct examination of
the components, an additional chromatographic step was developed. One-half of
Fraction IV was combined with an equal volume of saturated ammonium sulfate (4 °C)
and the resulting precipitate collected by centrifugation and redissolved in 4 ml buffer
U, and dialyzed overnight versus 250 ml buffer U (one buffer change). The dialysate
was then applied to a 5 ml Q-Sepharose fast flow column (1.4 cm diameter)
equilibrated in buffer U. The column was washed with 15 ml buffer U, and the
polymerase activity was eluted with a 75 ml 0->275 mM NaCl gradient in buffer U.
A contaminating protein was found to elute toward the end of the activity peak; thus,
fractions were carefully selected for pooling based on purity. All fractions that
contained greater than 50 µg protein and a specific activity greater than 104,000 were
pooled (fractions 28-36) to yield Fraction V (25 ml, Table 1, Figure 5). Fractions (1.5
ml each) were collected and assayed for polymerase activity using the gap-filling assay
described above, and for conductivity using a Radiometer CDM83 conductivity meter
on samples diluted 1:100 in distilled water equilibrated to room temperature.
Standards were also run in parallel.

Figure 5 shows polymerase activity (open circles), protein (filled squares) and
conductivity (filled triangles) across the column fractions. Chromatography on
Q-Sepharose yielded a single peak of polymerase activity between fractions 28-40, and a
6-fold increase in specific activity (Figure 5, Fraction V; Table 1).
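
The specific activities and step-wise enrichment in Table 1 can be cross-checked by dividing units by protein. The sketch below is a Python illustration using the tabulated values only; because the table reports rounded figures, small differences from the quoted fold increases are to be expected.

```python
# Cross-check of Table 1: specific activity = units / protein.
# Units and protein are the tabulated values; Fraction I was not assayed.

TABLE_1 = {
    "II": (2_200_000, 6540.0),   # (units, mg protein)
    "III": (6_700_000, 686.0),
    "IV": (1_400_000, 54.0),
    "V": (760_000, 4.8),
}

specific_activity = {
    fraction: units / protein_mg
    for fraction, (units, protein_mg) in TABLE_1.items()
}


def fold_increase(later: str, earlier: str) -> float:
    """Ratio of specific activities between two fractions."""
    return specific_activity[later] / specific_activity[earlier]


for fraction, value in specific_activity.items():
    print(f"Fraction {fraction}: {value:,.0f} units/mg")
print(f"III vs II: {fold_increase('III', 'II'):.1f}-fold")
print(f"V vs IV:   {fold_increase('V', 'IV'):.1f}-fold")
```

With these inputs the Q-Sepharose step comes out at roughly 6-fold, in line with the text, while the BioRex-70 step comes out somewhat above the 25-fold quoted earlier.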

Samples (10 µl) of fractions 28-40 across the polymerase activity peak were
analyzed by SDS-PAGE (10%). The Coomassie blue-stained gel is shown in Figure 6.
The migration positions of the E. coli DNA polymerase III holoenzyme (Enzyco)
subunits α (130 k), τ (71.1 k) and β (40.5 k) are indicated with arrows on the right
hand side of the gel (Figure 6). A 130 kDa protein eluted in parallel to the activity
profile (Figure 6). A 43 kDa protein eluted later and did not parallel the eluted
activity; this protein is presumably a contaminant. Two additional proteins of 63±6
and 50±5 kDa eluted roughly in parallel with the 130 kDa protein, providing
candidates for polymerase III-associated proteins, possible subunits of a T.
thermophilus pol III holoenzyme. The candidate α and two apparently comigrating
proteins of 63 kDa and 50 kDa comprised approximately 50% of the protein resolved
on an SDS gel. The 63 kDa protein chromatographed in parallel with the polymerase.
The 50 kDa protein was more strongly represented toward the early portion of the
peak. These data suggest that the 63 kDa and 50 kDa bands correspond to the T.
thermophilus τ and γ subunits, respectively.

Rigorous quantitation of the extent of purification of T. thermophilus
polymerase III was not possible since the gap-filling assay used was not specific for
the polymerase III holoenzyme. However, it is estimated that the T. thermophilus
polymerase III was purified approximately 50,000-fold, assuming all of the DNA
polymerase III activity was recovered in Fraction II and it represented 10% of the Fr
II polymerase activity. The specific activity of purified T. thermophilus pol III in
Fraction V was 800,000 units/mg at 60°C. The specific activity of pure E. coli pol III
is known to be 2.5 x 10^6 at 30°C (Kim and McHenry, J. Biol. Chem., 271:20681-
20689 [1996]). Thus, after correction for the mass of contaminants and the associated
τ and γ proteins, the expected specific activity of T. thermophilus pol III core was
estimated to be approximately 2 x 10^6 (i.e., close to the activity of its E. coli
counterpart when compared at temperatures slightly below their optimal growth
temperature).
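
One way to arrive at a purification factor of this order is sketched below in Python. The 10% share and complete recovery in Fraction II are the assumptions stated in the text; taking total Fraction I protein as the baseline and using the 37°C specific activity of Fraction V are our own assumptions about how the estimate was formed, so this is an illustrative reconstruction rather than the authors' calculation.

```python
# One possible reconstruction of the ~50,000-fold purification estimate.
# From the text: all pol III is recovered in Fraction II and accounts for
# 10% of the Fraction II polymerase activity.
# Our assumption: the baseline is total protein in Fraction I, and the
# Fraction V specific activity measured at 37 °C is used.

FR_II_UNITS = 2_200_000              # Table 1
POL_III_SHARE = 0.10                 # stated assumption
FR_I_PROTEIN_MG = 73_800.0           # Table 1
FR_V_SPECIFIC_ACTIVITY = 160_000.0   # units/mg at 37 °C, Table 1

pol_iii_units = FR_II_UNITS * POL_III_SHARE
baseline_specific_activity = pol_iii_units / FR_I_PROTEIN_MG
fold_purification = FR_V_SPECIFIC_ACTIVITY / baseline_specific_activity

print(f"baseline pol III specific activity: {baseline_specific_activity:.2f} units/mg")
print(f"estimated purification: {fold_purification:,.0f}-fold")
```

Under these assumptions the result is on the order of 5 x 10^4, consistent with the approximate figure given above.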

Fractions containing the highest polymerase specific activity (28-36) as 
determined from the gap-filling polymerase activity profile of Figure 5 were pooled 
(Fraction V) and solid ammonium sulfate was added to 75% (w/v). 


H. Enzyme Activity of T. thermophilus Fraction V as 
a Function of Temperature 

In order to determine the stability of the T. thermophilus DNA polymerase III
Fraction V, the DNA polymerase activity of the purified T. thermophilus DNA
polymerase Fraction V was measured as a function of temperature. Polymerase
activity was measured using the gap-filling assay described above, with the exceptions
that incubation was at selected temperatures between 37°C and 80°C and that
0.14 µg of Fraction V protein was used. Polymerase activity (pmol TTP incorporated in 5
minutes) was plotted against the reaction temperature (Figure 7). The results shown in
Figure 7 demonstrate that the gap-filling activity of the purified DNA polymerase
complex (Fraction V) has a broad temperature optimum between 60-70°C.

EXAMPLE 3

Amino Acid Sequencing of the Purified T. thermophilus
DNA Polymerase III τ and γ Proteins

The τ and γ subunits of E. coli DNA polymerase III holoenzyme are both
encoded by the same gene, dnaX. The γ subunit (47.4 kDa) arises as a result of
translational frameshifting which yields a truncated version of the τ (71.0 kDa)
subunit. Both proteins have identical amino acid sequences at the N-terminus.

The SDS-PAGE analysis of the Q-Sepharose eluant (described in Example 2
and Figure 6) showed that at least 2 other proteins, of 63 kDa and 50 kDa, eluted
together with the T. thermophilus α subunit. As with all other molecular weight
values herein reported based on SDS-PAGE, these molecular weight values are
approximate values. These molecular weights correspond closely to those of the τ and
γ subunits of the E. coli DNA polymerase III holoenzyme. To confirm that the 63
kDa and 50 kDa proteins which co-eluted with the T. thermophilus α subunit
corresponded to the τ and γ subunits of a DNA polymerase III holoenzyme, the 63
and 50 kDa proteins were subjected to N-terminal and internal amino acid
sequence analysis as follows.

For N-terminal amino acid analysis, 120 ng of Fraction V was fractionated by
SDS-PAGE (10%) and transferred to a Hyperbond PVDF membrane as described
above. The protein bands at 50 kDa and 63 kDa were excised and subjected to
N-terminal amino acid sequencing in a 477 Protein Sequencer according to the
manufacturer's (ABS, Inc.) instructions. The sequence (SEQ ID NO:1) of the 19
N-terminal amino acids of the 50 kDa T. thermophilus protein was identical to the first
19 residues of the 20-amino acid sequence (SEQ ID NO:2) obtained for the 63 kDa
protein (i.e., this is consistent with both being the product of the same gene) (See,
Figure 8). The sequenced regions were 67% identical to the homologous 18-residue E.
coli DnaX sequence (SEQ ID NO:3), the only bacterial DnaX protein that has been
directly characterized on the basis of activity, and 74% identical to the homologous
19-residue sequence within B. subtilis DnaX (SEQ ID NO:4). In Figure 8,
homologous regions are also indicated (SEQ ID NOS:5 and 6) between the T.
thermophilus, E. coli, and B. subtilis sequences.
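
Percent identity over a short aligned stretch is simply the fraction of matching positions. The following Python sketch shows the calculation on a hypothetical pair of toy peptides (not the SEQ ID sequences, which are not reproduced here) and then works backward from the quoted percentages to the match counts they imply; both helper functions are illustrative.

```python
# Percent identity between two aligned, equal-length peptide sequences.

def percent_identity(seq_a: str, seq_b: str) -> float:
    """Return the percentage of positions at which two aligned sequences match."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


def matches_for(percent: int, length: int) -> list:
    """Integer match counts whose rounded percent identity equals `percent`."""
    return [m for m in range(length + 1) if round(100.0 * m / length) == percent]


# Hypothetical toy peptides, for illustration only (6 of 8 positions match)
toy_identity = percent_identity("MKTAYIAK", "MKSAYLAK")

# Match counts implied by the percentages quoted in the text
ecoli_matches = matches_for(67, 18)      # 18-residue E. coli comparison
bsubtilis_matches = matches_for(74, 19)  # 19-residue B. subtilis comparison

print(toy_identity, ecoli_matches, bsubtilis_matches)
```

On this reading, the quoted values correspond to 12 of 18 and 14 of 19 identical residues, respectively.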

The association of a T. thermophilus DnaX homolog with a protein that cross
reacts with anti-pol III α monoclonal antibodies and which has the same molecular
weight as the E. coli pol III α subunit in a 50% pure protein preparation establishes
the existence of a multi-subunit pol III holoenzyme in T. thermophilus. Furthermore,
the results demonstrate that the 63 and 50 kDa proteins are the T. thermophilus γ and
τ subunits of pol III holoenzyme.

In addition, the 63 kDa band was blotted and treated with Lys-C as described
above and an internal peptide was sequenced to yield the sequence
ARLLPLAQAHFGVEEVVLVLEGE (SEQ ID NO:88). This peptide did not exhibit
detectable homology with known DnaX sequences (i.e., no hits were identified in a
BLAST search of the entire NR database), but was important for confirming the
correctness of the DNA sequence of the Thermus thermophilus dnaX gene in
subsequent experiments described below.

EXAMPLE 4 

Cloning and Sequencing the T. thermophilus dnaX Gene

In order to determine whether the 63 kDa and 50 kDa proteins are products of
the T. thermophilus dnaX gene, the structural genes were isolated by a reverse genetics
approach using PCR to obtain a fragment of the gene. The gene fragment was used as
a probe to obtain the full length of T. thermophilus dnaX, enabling its full sequence to
be determined. From the N-terminal sequence, a peptide was selected, FQEVVGQ
(SEQ ID NO:89), that would provide a PCR primer with the least degeneracy with all
possible codons represented (512-fold). To provide an internal primer, a KTLEEP
(SEQ ID NO:90) amino acid sequence common to E. coli, B. subtilis (Carter et al., J.
Bacteriol., 175:3812-3822 [1993]; and O'Donnell et al., Nuc. Acids Res. 21:1-3
[1993]) and many other eubacteria was exploited. From this consensus sequence a
17-mer primer with 256-fold degeneracy was designed that included all possible
combinations of codons. A PCR fragment of the predicted size (388 nucleotides) was
obtained. This fragment was close to the spacing of the corresponding regions in the
E. coli dnaX gene (393 nucleotides). The PCR fragment was also highly homologous
to dnaX genes in the fragment between the regions corresponding to the PCR primers,
eliminating the bias imposed by primer selection. The PCR-generated probe led to the
isolation of a full length gene (Figure 9A) that is highly homologous to other
eubacterial dnaX genes. The deduced amino acid sequence was 42% identical to the
corresponding segment of the B. subtilis dnaX gene. These steps are discussed in detail
below.
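
The degeneracy of a mixed-base primer is the product of the number of bases allowed at each position. The Python sketch below counts this for two IUPAC-coded sequences that are our own back-translations of FQEVVGQ and KTLEEP, with the final wobble position omitted; they are assumptions chosen to illustrate the arithmetic, not the primer sequences actually used, and the internal primer would in practice be synthesized as the reverse complement (which has the same degeneracy).

```python
# Degeneracy of mixed-base primers written in IUPAC nucleotide codes.
# The primer strings are illustrative back-translations (our assumption).

IUPAC_CHOICES = {
    "A": 1, "C": 1, "G": 1, "T": 1,
    "R": 2, "Y": 2, "S": 2, "W": 2, "K": 2, "M": 2,
    "B": 3, "D": 3, "H": 3, "V": 3,
    "N": 4,
}


def degeneracy(primer: str) -> int:
    """Number of distinct sequences represented by a degenerate primer."""
    total = 1
    for base in primer.upper():
        total *= IUPAC_CHOICES[base]
    return total


# F   Q   E   V   V   G   Q (first two bases only)
N_TERMINAL = "TTY" "CAR" "GAR" "GTN" "GTN" "GGN" "CA"
# K   T   L   E   E   P (first two bases only); YTN covers all six Leu codons
INTERNAL = "AAR" "ACN" "YTN" "GAR" "GAR" "CC"

print(len(N_TERMINAL), degeneracy(N_TERMINAL))
print(len(INTERNAL), degeneracy(INTERNAL))
```

With these back-translations the counts agree with the 512-fold and 256-fold degeneracies and the 17-mer length stated above.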

This example involved (A) Preparation of T. thermophilus genomic DNA; (B)
Isolation of T. thermophilus dnaX probe by PCR; (C) Cloning PCR-amplified dnaX
probe; (D) Restriction enzyme digestion of T. thermophilus genomic DNA; (E)
Southern blotting; (F) Cloning the 7.1 kb PstI fragment; and (G) Colony hybridization
and sequencing dnaX.

A. Preparation of T. thermophilus Genomic DNA

Thermus thermophilus strain pMAF.kat (obtained from J. Berenguer; See, Lase
et al., [1992]) genomic DNA was prepared using previously described methods
(Ausubel et al., Current Protocols in Molecular Biology, John Wiley & Sons, New
York NY [1995]). Briefly, T. thermophilus strain pMAF.kat was grown overnight at
70°C from a single colony in 100 ml of Thermus rich media (8 g/l Bacto-tryptone
(Difco), 4 g/l yeast extract (Difco), 3 g/l NaCl (Fisher Scientific) (pH 7.5)) with
vigorous shaking in a New Brunswick Model G25 incubator. The culture was
centrifuged in a 250 ml centrifuge bottle in a Sorvall GSA rotor at 6000 rpm (5858 x
g) for 6 minutes, followed by resuspension of the pellet from a 100 ml overnight
culture in 9.5 ml TE buffer (pH 7.5) in an SS34 centrifuge tube (Sorvall). 0.5 ml
10% SDS and 50 µl of fungal proteinase K (20 mg/ml in dH2O) (GibcoBRL) were
mixed with the suspension, the mixture was incubated at 37°C for 1 h, followed by
addition of 2.35 ml of 4 M NaCl (Fisher Scientific), and 1.6 ml of
cetyltrimethylammonium bromide (CTAB)/NaCl solution (10% CTAB in 0.7 M NaCl).
The resulting mixture was incubated at 65 °C for 20 minutes followed by extraction
with an equal volume of chloroform/isoamyl alcohol (30:1 mixture). The mixture was
centrifuged in an SS34 rotor (Sorvall) for 10 minutes at 6000 x g, and 0.6 volumes of
isopropanol (Fisher) was added to the pellet to form a stringy, white precipitate. The
stringy white precipitate was placed into 25 ml 75% ethanol, centrifuged for 5 minutes
at 10,000 x g and the supernatant discarded. The pellet was resuspended in 4 ml of
TE buffer (pH 7.5), the DNA concentration measured by determining the absorbance
at a wavelength of 260 nm in a spectrophotometer, and the DNA concentration
adjusted to 100 µg/ml with TE buffer (pH 7.5). 4.3 g of cesium chloride (CsCl)
(Sigma cat. # C-4036) and 200 µl of ethidium bromide (10 mg/ml) were added per 4
ml of TE buffer (pH 7.5) used to resuspend the DNA pellet. The mixture was
centrifuged in a Sorvall T-1270 rotor in a Sorvall RC70B ultracentrifuge at 55,000
rpm, at 15°C overnight. The chromosomal DNA band (highlighted with UV light) was
removed, the ethidium bromide extracted by adding an equal volume of H2O-saturated
butanol, mixing thoroughly by inverting the tube, and briefly centrifuging the tube (30
seconds at 14,000 x g) to separate the phases. The butanol extraction was repeated until
pink color could not be detected, and then the extraction was repeated one more time.
The remaining DNA/cesium chloride/TE buffer (pH 7.5) mixture was dialyzed 

20 overnight in 500 x volume of TE buffer (pH 7.5) to remove the cesium chloride. 

DNA was precipitated by the addition of 1/10 volume of 2.5 M sodium acetate (pH
adjusted to 5.2 with glacial acetic acid) and 1 volume of isopropanol, and
centrifugation at 11,951 x g for 10 minutes. The pellet was washed with 75% ethanol
and centrifuged at 11,951 x g for 2 minutes. The supernatant was removed, and the
wash repeated. After the final wash, the supernatant was removed and the pellet
allowed to dry until it became slightly translucent. The pellet was resuspended in 25
ml of TE buffer (pH 7.5) by gentle rocking overnight at room temperature. DNA was
quantitated by spectrophotometry by taking a spectrum of wavelengths from 220 nm to
340 nm. Two preparations were obtained. Preparation A diluted 1/70 in TE buffer
gave an OD260 of 0.175, consistent with a DNA concentration of 0.612 mg/ml using as
a conversion factor an A260 of 1 corresponding to 50 µg/ml. Preparation B, with a DNA concentration of
0.906;
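
The spectrophotometric conversion reported for Preparation A can be checked with a short calculation. The following is a minimal sketch only; the function name and argument names are illustrative assumptions, while the OD260 reading, the 1/70 dilution, and the factor of 50 µg/ml per A260 unit are taken from the text above.

```python
def dsdna_conc_mg_per_ml(od260, dilution_factor, ug_per_ml_per_od=50.0):
    """Return the concentration of the undiluted dsDNA stock in mg/ml.

    od260            -- absorbance at 260 nm measured on the diluted sample
    dilution_factor  -- e.g. 70 for a 1/70 dilution
    ug_per_ml_per_od -- conversion factor (50 ug/ml per A260 unit for dsDNA)
    """
    ug_per_ml = od260 * dilution_factor * ug_per_ml_per_od
    return ug_per_ml / 1000.0


# Preparation A: OD260 of 0.175 measured on a 1/70 dilution.
prep_a = dsdna_conc_mg_per_ml(od260=0.175, dilution_factor=70)
print(f"Preparation A: {prep_a:.4f} mg/ml")
```

The computed value (about 0.61 mg/ml) agrees with the 0.612 mg/ml stated for Preparation A.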

B. Isolation of T. thermophilus dnaX Probe by PCR 

An amino-terminal peptide sequence of T. thermophilus that showed homology
to DnaX from E. coli, B. subtilis, and H. influenzae was used to design oligonucleotide
primers which hybridized to the amino-terminal region of the putative T. thermophilus
dnaX. Two oligonucleotide primers (X1Fa and X1Fb) were designed from the
N-terminal peptide sequence to keep codon degeneracy at 512-fold. To design primers
for regions located downstream from the N-terminus, conserved regions in DnaX were
identified by comparing known sequences from bacteria and bacteriophage. These
conserved regions were used to design primers that, with the amino-terminal primer,
could be used to amplify a portion of the T. thermophilus dnaX gene by PCR.

A sequence which was homologous to DnaX from E. coli, B. subtilis, and H.
influenzae had the sequence Ser Ala Leu Tyr Arg Arg Phe Arg Pro Leu Thr Phe Gln
Glu Val Val Gly Gln Glu (SEQ ID NO:102). The underlined amino acids were
chosen for designing an oligonucleotide primer in order to give a primer of sufficient
length with minimal degeneracy. The N-terminal primers were 20-mers with 512-fold
degeneracy, i.e., forward primer (X1Fa) [5'-TTY CAR GAR GTN GTN GGW CA-3'
(SEQ ID NO:21)] and forward primer (X1Fb) [5'-TTY CAR GAR GTN GTN GGS
CA-3' (SEQ ID NO:91)].

The conserved DnaX amino acid sequences for design of reverse primer X126R
were as follows:

The above sequences correspond to SEQ ID NOS:92-96. The 17-mer, 128-fold
degeneracy reverse primer (X126R) had the sequence 5'-ARC ATR TGN RCY
TCR TC-3' (SEQ ID NO:97).

The conserved DnaX amino acid sequences for design of reverse primer X139R
were as follows:

E. coli         S F N A L L K T L E E P P E H V
B. subtilis     A F N A L L K T L E E P P E H C
H. influenzae   A F N A L L K T L E E P P E Y V
Consensus                   K T L E E P

The above sequences correspond to SEQ ID NOS:98-101. The 17-mer, 256-fold
degeneracy reverse primer (X139R) had the sequence 5'-GGY TCY TCN ARN
GTY TT-3' (SEQ ID NO:22).
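
Because X139R is a reverse primer, its relationship to the conserved peptide becomes visible only after taking the reverse complement. The sketch below is an illustration under the assumption of standard IUPAC complement rules; the function names are hypothetical and only the primer sequence is taken from the text.

```python
# Complement of each IUPAC nucleotide code (R<->Y, K<->M, B<->V, D<->H).
IUPAC_COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(seq):
    """Reverse complement of a sequence written in IUPAC codes."""
    bases = seq.replace(" ", "").upper()
    return bases.translate(IUPAC_COMPLEMENT)[::-1]


def codons(seq):
    """Split a sequence into consecutive triplets (last one may be partial)."""
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


x139r = "GGY TCY TCN ARN GTY TT"
sense_strand = reverse_complement(x139r)
print(sense_strand)
print(codons(sense_strand))
```

Read on the sense strand, the triplets are AAR (Lys), ACN (Thr), YTN (which covers all Leu codons), GAR (Glu), GAR (Glu) and CC (the first two bases of a Pro codon), matching the consensus K T L E E P.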

For PCR reactions, the Boehringer Mannheim Expand™ long template PCR
system was used following the manufacturer's recommendations, except as noted
below. Reactions were conducted in Boehringer Expand™ buffer 1
supplemented with 0.5 mM extra Mg++. The annealing steps were conducted at 48°C,
elongation at 68°C and the melting step at 94°C; 26 total cycles were run. Products
were separated on 2% FMC Metaphor agarose gels, visualized by Sybr green I
staining, extracted and cloned into vector pCRII (Invitrogen).

Six µl of gel loading dye was added to each 30 µl PCR reaction described
above. Thirty µl of this mixture was loaded into wells of a 2% Metaphor gel in 1x
TAE and subjected to gel electrophoresis at 250 volts for 4 hours.

Using the dnaX primers X1Fa and X139R, and T. thermophilus chromosomal
DNA preparation A, approximately 10 bands were observed that ranged in size from
350 bp to greater than 1500 bp. Half of these bands were of equal or greater intensity
than a band observed at 392 bp as estimated from the gel. Subsequent cloning and
sequencing (described below) of the 392 bp gel band showed that this band was 388
bp.

Using dnaX primers X1Fa and X126R, approximately 18 bands were observed
which ranged in size from 100 bp to greater than 1500 bp. Approximately half of the
bands were of equal or greater size than the desired 357 bp band. A 329 bp band was
selected for cloning as described below.

With the dnaX primers X1Fb and X126R, approximately 15 bands were
observed. These included a band that migrated to the same position as the above
described 329 bp product from X1Fa and X126R.

With the X1Fb and X139R primers, approximately 20 bands were observed,
with one band at approximately 420 bp appearing near the desired product. However,
since the 392 bp product from the X1Fa and X139R primer pair appeared more
promising, the 420 bp product was not further pursued.

A plasmid isolate (pMGC/lFA-139A; DMSO 1327) containing an EcoRI
fragment of the predicted size based upon the distance between the homologous regions
of the E. coli dnaX gene (approximately 393 bases) was sequenced (Figure 10). In
Figure 10A, the sequences corresponding to primers are underlined. The DNA
sequence shown in Figure 10A is the complement of the message strand. The
sequence shown is of relatively low quality (i.e., approximately 90% estimated
accuracy), yet a BLAST search revealed strong homology to bacterial dnaX genes. B.
subtilis (shown in the lower line of the alignment of Figure 10B) showed 42%
identity over a 71 amino acid stretch. In Figure 10B, the upper peptide sequence is T.
thermophilus. For B. subtilis, the numbers refer to DnaX amino acid residues;
numbers for T. thermophilus refer to the first base of the anticodon for the shown
amino acid residue, with base #1 being assigned to the C to the left of the
corresponding primer sequence near the 3' end of the sequence shown (Figure 10A).
In Figure 10, the string of "X"s indicates a defect in the printout from the reported NIH
BLAST alignment.

C. Cloning PCR-Amplified dnaX Probe

PCR-amplified DNA which contained regions encoding the τ and γ subunits of
T. thermophilus DNA polymerase III was cloned into a modified pCRII vector
(Invitrogen) which was used to transform DH5α E. coli, and the plasmids containing
the PCR-amplified DNA were analyzed by DNA sequencing. These steps are
described in more detail in the following sections.

1. Cloning of PCR-Amplified DNA Into E. coli 

PCR-amplified sequences were extracted from gel slices prepared as described
above using freeze-squeeze extraction. Extraction was performed by placing the gel
slice in a sterile micro-spin device (0.45 µm pore size cellulose acetate filter (LSPI))
that was then inserted into a 2.0 ml Eppendorf tube, and the assembly was placed in a
-80°C freezer for 10 minutes. The following steps were performed at room
temperature. The frozen gel slice was then centrifuged in the micro-spin assembly for
4 minutes at maximum rpm in a microcentrifuge (14,000 x g). The crushed gel slice
was then resuspended in 100 µl of 2.5 M Na acetate, re-frozen for 10 minutes at
-80°C, and centrifuged as before. The liquid containing the gel slice DNA was
collected in the bottom of the Eppendorf tube (approximately 300 µl) and the DNA
was precipitated by the addition of 1 µl glycogen (Boehringer Mannheim) and an
equal volume (300 µl) of isopropanol. The precipitate mixture was centrifuged
(14,000 x g) in a microcentrifuge for 10 minutes, the supernatant discarded, and the
pellet washed with 75% ethanol. The ethanol wash supernatant was discarded, and the
pellet was resuspended in 5 µl TE buffer (pH 7.5). The DNA was quantitated by
adding 1 µl of DNA to 300 µl of a 1/400 dilution in TE buffer (pH 7.5) of
PicoGreen™ (Molecular Probes) in a well of a 96-well microtiter plate (Life Sciences),
and compared to known concentrations of Thermus thermophilus genomic DNA by
excitation of the PicoGreen™ at 480 nm on an SLT Fluostar Microplate Fluorometer
and measuring emission at 520 nm.
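
Quantitation against known genomic DNA standards amounts to fitting a standard curve and reading the unknown off it. The following is a minimal sketch, assuming a linear fluorescence response over the working range; the standard amounts and fluorescence readings are hypothetical values invented for illustration and are not data from this example.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical standards: ng of genomic DNA per well vs. fluorescence units.
standards_ng = [0.0, 5.0, 10.0, 20.0, 40.0]
standards_fluor = [12.0, 112.0, 212.0, 412.0, 812.0]

slope, intercept = fit_line(standards_ng, standards_fluor)


def ng_from_fluorescence(fluor):
    """Interpolate the DNA amount in a well from its fluorescence reading."""
    return (fluor - intercept) / slope


# Hypothetical reading for a well that received 1 ul of sample.
sample_ng = ng_from_fluorescence(312.0)
print(f"Sample well: about {sample_ng:.1f} ng DNA, i.e. {sample_ng:.1f} ng/ul")
```

Because 1 µl of sample is added per well, the interpolated mass in nanograms is numerically the sample concentration in ng/µl.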




DNA that was PCR-amplified by the Expand High Fidelity enzyme mix had a 1
nucleotide addition (dATP) that allows compatible cohesive annealing to a vector
(pCRII vector, Invitrogen) with a 1 nucleotide T overhang. Fifty or 150 pmoles of
PCR-amplified DNA was added to 50 pmoles of vector DNA in a 10 µl reaction
mixture containing 1 µl of 10x Ligation Buffer (Invitrogen), 1 µl of T4 DNA ligase (1
Weiss unit/µl, GibcoBRL), and sterile dH2O to bring the volume to 10 µl. Ligation
was carried out at 14°C overnight. The DNA in the ligation mixture was precipitated
by the addition of 1 µl of glycogen (20 mg/ml, Molecular Biology Grade, Boehringer
Mannheim), 1/10 volume (1 µl) 2.5 M sodium acetate (pH 5.2), and 2.5 volumes of
ethanol. The precipitate was recovered by centrifugation at 14,000 x g for 10 minutes,
the supernatant was removed, and the pellet washed with 0.3 ml of 75% ethanol. The
washed pellet was re-centrifuged at 14,000 x g for 2 minutes, the supernatant removed
completely with a fine-tipped pipet, and the pellet was resuspended in 5 µl of dH2O.
The DNA was used to transform electrocompetent DH5α E. coli by electroporation as
described below.

Electrocompetent DH5α E. coli cells were prepared by picking an isolated
colony of DH5α (Gibco-BRL) and growing overnight in 10 ml SOP media. One ml of
overnight culture was added to 500 ml SOP media and grown to an OD600 reading of
0.6-0.8. The culture was placed immediately in an ice water bath to chill rapidly and left
on ice for at least 15 min. The following steps were carried out in the cold room at 4°C.
Cells were centrifuged at 4000 rpm for 15 min, washed with 500 ml ice cold dH2O
and allowed to sit on ice for 30 min before spinning down in 500 ml centrifuge
bottles at 4000 rpm for 15 min. The pellet was washed with 500 ml ice cold dH2O and
spun down in 500 ml centrifuge bottles at 4000 rpm for 15 min, and the pellet
resuspended in 20 ml 10% glycerol. The cells were spun down in a 50 ml tube at
5000 rpm for 10 min, resuspended in approximately 0.5 ml 10% glycerol, and 0.2
ml aliquots of competent cell preparation frozen quickly in 1.7 ml centrifuge tubes
by placing in liquid nitrogen, then stored at -80°C.
For electroporation, electrocompetent cells were transformed as follows. Cells
were removed from the -80°C freezer and thawed on ice. Then 40 µl of electrocompetent
cells were placed into a 1.5 ml Eppendorf tube (pre-chilled for at least 5 minutes on
ice) and 2 µl of the dH2O re-suspended ligated DNA (ligated DNA + 5 µl H2O) was
added and allowed to sit for 5 minutes. The transformation mixture was removed
from the 1.5 ml Eppendorf tube and added to pre-chilled electroporation cuvettes (0.2
cm electrode gap, BioRad). The electroporation was performed in an Eppendorf
Electroporator 2510 at 2500 V. Then, 1 ml of SOP media was immediately added to
the cuvette following electroporation, and the electroporated cells in SOP media were
transferred to a capped tube and incubated for 1 h with shaking at 37°C. After 1 hour,
aliquots of 2 x 200 µl, 1 x 50 µl, and 1 x 5 µl were plated by spreading onto LB
plates containing carbenicillin (50 µg/ml), and incubated overnight at 37°C in a
convection heat incubator.
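
The electroporation setting and the plating scheme above can be summarized numerically. The sketch below is illustrative only; the function names are assumptions, and the recovery culture volume of about 1042 µl (1 ml of media plus 40 µl of cells and 2 µl of DNA) is an approximation made here rather than a figure stated in the text.

```python
def field_strength_kv_per_cm(voltage_v, gap_cm):
    """Nominal field strength across the cuvette, in kV/cm."""
    return voltage_v / gap_cm / 1000.0


def plated_fractions(aliquots_ul, culture_ul):
    """Fraction of the recovery culture spread on each plate."""
    return [volume / culture_ul for volume in aliquots_ul]


# 2500 V applied across a 0.2 cm electrode gap.
field = field_strength_kv_per_cm(voltage_v=2500, gap_cm=0.2)

# Aliquots of 2 x 200 ul, 1 x 50 ul and 1 x 5 ul from ~1042 ul of culture.
fractions = plated_fractions([200, 200, 50, 5], culture_ul=1042)

print(f"Field strength: {field:.1f} kV/cm")
print("Fraction of culture per plate:", [f"{f:.3f}" for f in fractions])
```

Plating aliquots that differ by roughly 4-fold and 40-fold makes it likely that at least one plate carries well-separated colonies regardless of the transformation efficiency.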

2. Restriction Digest Analysis of Plasmids 

Plasmids were analyzed for the presence of insert by restriction digestion with
EcoRI restriction endonuclease. EcoRI restriction sites flank insert DNA in the pCRII
vector system. For EcoRI restriction digestion, 5 µl of Promega Plus Miniprep-isolated
plasmid (from 50 µl total) was added to a 10 µl reaction with 1 µl of 10x High
Salt reaction buffer (10x H buffer, Boehringer Mannheim), 1 µl of EcoRI restriction
endonuclease (10 units/µl, GibcoBRL), and 3 µl of dH2O. The mixture was incubated
at 37°C for 2 hours. Then 2 µl of 6x gel loading dye was added, and all 12 µl of the
reaction and dye were electrophoresed in 0.7% SeaKem GTG agarose (FMC) in 1x
TAE for 4 h at 80 V constant voltage. The presence of a gel band of the expected
size confirmed the presence of the insert in the plasmid. Using primers XFlb and
X136R, seven colonies were selected for further screening. Six had inserts of
approximately the correct size. One (clone A) was selected for sequencing (DMSO
1327). Six colonies were also obtained with the X1Fa-X126R primer pair. The
X139R-derived sequences were pursued since they were longer.






3. Large Scale Plasmid Purification 

Large scale plasmid purification was performed in order to obtain enough
DNA for sequencing, restriction digestion, etc. Large scale (i.e., 50 ml) plasmid
purification was performed using Qiagen plasmid preps. An isolated E. coli colony
was picked from an LB-carbenicillin (70 µg/ml) plate and grown overnight in 50 ml
of SOP media. The culture was centrifuged at 12,000 x g in a Sorvall SS34 rotor for
10 minutes in a 50 ml polypropylene tube. The supernatant was discarded. The pellet
was resuspended in 4 ml Buffer P1 (50 mM Tris (pH 8.0), 10 mM EDTA, 100 µg/ml
RNase A). Then 4 ml of Buffer P2 (200 mM NaOH, 1% SDS) was added, mixed gently by
inverting the tube 4 to 6 times, and the mixture incubated at room temperature for 5
min. Then 4 ml of chilled Buffer P3 (3.0 M potassium acetate (pH 5.5)) was added and mixed
immediately but gently, and the mixture was incubated on ice for 15 min, centrifuged at
approximately 20,000 x g for 30 min at 4°C and the supernatant removed promptly. The
supernatant was centrifuged again at approximately 20,000 x g for 15 min at 4°C, and the
resulting supernatant promptly removed and applied to a Qiagen-tip 100 (Qiagen)
which had been previously equilibrated by applying to the resin 4 ml of buffer QBT
(750 mM NaCl, 50 mM MOPS (pH 7.0), 15% ethanol, 0.15% Triton X-100). The
supernatant was allowed to flow through the resin under gravity. The Qiagen-tip 100
was washed with 2 x 10 ml of Buffer QC (1.0 M NaCl, 50 mM MOPS (pH 7.0), 15%
ethanol), and the DNA eluted with 5 ml Buffer QF (1.25 M NaCl, 50 mM Tris-HCl
(pH 8.5), 15% ethanol). The eluate was collected in a 12 ml centrifuge tube and the
DNA precipitated with 0.7 volumes of isopropanol at room temperature, and
centrifuged immediately at approximately 15,000 x g for 30 min at 4°C. The DNA
pellet was washed with 2 ml of 70% ethanol, air-dried for 5 min, and re-dissolved in
500 µl of dH2O. DNA was quantitated by spectrophotometry over the optical range of
220 nm to 340 nm by blanking against 700 µl dH2O followed by the addition of 10 µl of
DNA into the 700 µl dH2O for the measurement. The ratio of OD at 260 nm to OD
at 280 nm was between 1.8 and 2.0. Typically, 5 µg was obtained from a 50 ml
culture. The above procedure was used when plasmids were sequenced or used for
further cloning steps. The alternative DNA preparation method provided herein was
used just for screening methods.

D. Restriction Enzyme Digestion of T. thermophilus Genomic DNA 

Since a large probe that was identical to the gene had been prepared, a directed
approach was used to clone the full length T. thermophilus dnaX rather than screening
of a full library. A Southern blot against digested T. thermophilus DNA was
performed in order to detect a restriction fragment that was large enough to contain the
entire T. thermophilus dnaX gene. This fragment was extracted, cloned, and the
resulting colonies were screened by colony hybridization and a candidate clone
selected for sequencing. These steps are described as follows.

Restriction endonuclease digests of Thermus thermophilus DNA were carried
out in order to run the resulting digests on agarose gels and to blot for Southern
analysis using PCR-amplified fragments which had been cloned in E. coli as described
above. Restriction digestion was carried out at 37°C overnight in a 100 µl reaction
volume containing 20 µl of genomic DNA (18 µg preparation B chromosomal DNA),
10 µl 10X restriction endonuclease buffer (NEB), 5 µl restriction endonuclease (25 to
50 Units) (NEB), and 65 µl dH2O. An additional 5 µl of restriction endonuclease was
then added and the mixture incubated at 37°C for an additional 2 hours. The digest
was precipitated with 1 µl glycogen (20 mg/ml, Boehringer Mannheim), 1/10 volume
2.5 M sodium acetate (pH 5.2), and 3 volumes of ethanol, and centrifuged for 30 minutes
at 14,000 x g. The supernatant was removed and the pellet washed with 0.5 ml 75%
ethanol prior to resuspension of the pellet in 40 µl TE buffer (pH 7.5).

The prepared genomic DNA was run on agarose gels together with 60,000 cpm
of 32P end-labelled DNA Molecular Weight Marker VII (Boehringer Mannheim) and
1.75 µg (7 µl) of DNA Molecular Weight Marker VII (Boehringer Mannheim). End-labeled
molecular weight markers were prepared by incubating an end-labeling reaction
mixture [8 µl of DNA Molecular Weight Standard (2 µg, Boehringer Mannheim), 8 µl
dH2O, 3 µl 0.5 M Tris (pH 7.5), 0.3 µl 1 M MgCl2, 0.3 µl 1 M DTT, 1 µl (2 units/µl),
1.5 µl 1.5 mM each of dCTP, dGTP, and dTTP, 8 µl 32P-dATP (80 µCi, 3000
Ci/mmole in 5 mM Tris-HCl (pH 7.5))] for 1 hour at 37°C. To this mixture was added
2 µl glycogen (20 mg/ml, Boehringer Mannheim), 1/10 volume 2.5 M sodium acetate
(pH adjusted to 5.2 with glacial acetic acid), and 3 volumes of ethanol, and the resulting mixture was
centrifuged in a microcentrifuge at 14,000 x g for 5 minutes, the supernatant discarded
and the pellet washed with 400 µl of ethanol. The pellet was resuspended in 50 µl TE
buffer (pH 7.5), 1 µl was spotted onto a 24 mm diameter GF/C filter (Whatman), the
filter paper dried, and counted with 5 ml scintillation fluid in a scintillation counter,
yielding DN A of approximately 10 s cpm/^ig. 
To 10 µl of digested genomic DNA (3 µg) was added 2 µl of gel loading dye
and the mixture run on a 0.7% Seakem GTG (FMC) agarose gel in 1x TAE for 14.2
hours, at room temperature, at 35 V constant voltage. The gels were stained for 1 h
with gentle rocking with Sybr green in 1X TAE.

E. Southern Blotting

Southern blotting of T. thermophilus genomic DNA with PCR-amplified probes
was performed by linking genomic DNA to a positively charged nylon membrane for
colony and plaque hybridization (Boehringer Mannheim) using an alkaline transfer
buffer as previously described (Ausubel FM et al. [1995] supra). Briefly, the gel was
rinsed with distilled water, and the gel gently shaken with 10 gel volumes of 0.4 M
NaOH on a platform shaker for 20 min. The positively charged nylon membrane was
placed without prewetting directly onto the gel and alkaline transfer was carried out in
0.4 M NaOH. Since alkaline transfer is quicker than high-salt transfer, the blot could
be taken apart any time after 2 h, but the blot was generally performed overnight. The
membrane was rinsed in 2X SSPE, and allowed to air dry in order to remove agarose
fragments that may adhere to the membrane and to neutralize the membrane. The
membrane filters were baked at 80°C in a vacuum oven and were ready to use for
hybridization with a labeled probe.

1. Preparation of Probe

Probe DNA was prepared from vectors (pCRII, Invitrogen) containing genomic
DNA segments which encode portions of subunits of the Thermus thermophilus DNA
polymerase III holoenzyme, as described above. The gene segments were
released from the vector by digestion of 10 µg of the cloned DNAs in a 300 µl
reaction with EcoRI (100 Units, Gibco BRL) in 1x Boehringer Mannheim High Salt
restriction digest buffer H at 37°C for 2 hours. Then 60 µl of gel loading dye was added to
each digest, the entire 360 µl digest was loaded into 260 x 1.5 mm wells, and the
digests were electrophoresed in 4% NuSieve GTG agarose (FMC) in 1x TAE buffer
at 100 V (constant voltage) for 4 hours. The gel was stained with
300 ml Sybr green (Molecular Probes, diluted 1/10,000 in 1x TAE) for 1 h with gentle
rocking. Bands containing the gene fragments were excised with a clean scalpel,
placed into a tared 4 ml Nunc tube (Nunc 4.5 ml Cryotube, NUNC), and the bands
were weighed. The slices were melted by putting the tubes in a gravity convection
incubator at 72°C for 30 minutes or until the slice was completely melted. Once
melted, the tubes were cooled to 45°C in a temperature block and 1/10 volume of 10x
β-agarase buffer (Gibco BRL; 100 mM Bis-Tris (pH 6.5), 10 mM EDTA) was added,
followed by the addition of 4 µl of β-agarase (GibcoBRL, 1 unit/µl). The mixture
was incubated overnight in a gravity convection incubator at 42°C. To precipitate the
DNA, 1/10 vol. of 3 M NaOAc (pH 5.2) was added and the mixture chilled on ice.
Any remaining undigested agarose was removed by centrifugation for 2 minutes in 2
ml Eppendorf tubes at 14,000 x g in an Eppendorf microcentrifuge. The supernatant
was removed and the DNA precipitated by the addition of 1 µl glycogen (Boehringer
Mannheim, molecular biology grade, 20 mg/ml) and 2.5 volumes of ethanol, followed
by centrifugation at 14,000 x g in an Eppendorf microcentrifuge for 10 minutes. The
supernatant was removed and the pellet washed with 0.5 ml of 75% ethanol and
centrifuged at 14,000 x g in an Eppendorf microcentrifuge for 2 minutes. The pellets
were resuspended in 10 µl of a 1/10 dilution of TE buffer (1 mM Tris (pH 7.5), 0.1
mM EDTA final concentration). The DNA was quantitated by adding 1 µl of DNA to
300 µl of a 1/400 dilution of PicoGreen™ in a well of a 96-well microtiter plate (Life
Sciences), and compared to known concentrations of Thermus thermophilus genomic
DNA by excitation of the PicoGreen™ at 480 nm and measuring emission at 520 nm,
on an SLT Fluostar Microplate Fluorometer.

2. Labelling of Probe 

The probe was labeled by adding 50 ng of DNA to a random hexamer priming
reaction mixture (Random Primed DNA Labeling Kit, Boehringer Mannheim)
following the manufacturer's instructions. Briefly, 50 ng DNA in 10 µl dH2O was
placed into a 0.2 ml thin-walled PCR tube, overlaid with 25 µl mineral oil and the
DNA denatured by heating at 95°C for 10 minutes in a thermal cycler (MJ Research).
At 9 minutes through the 10 minute denaturation, 2 µl of Boehringer Mannheim
hexanucleotide reaction mix (tube 6 in kit) was added and the denaturation continued
for an additional minute. Just prior to the end of the 10 minute denaturation, the
mixture was rapidly cooled by removing the tube from the thermal cycler and
immediately placing it in a beaker of ice-water. One µl each of dATP, dTTP, dGTP
(0.5 mM in Tris buffer, final concentration 25 µM) was added to the reaction,
followed by 5 µl of α-32P-dCTP (50 µCi, 3000 Ci/mmole, in 5 mM Tris-HCl (pH 7.5)),
and 1 µl of Klenow DNA polymerase (Boehringer Mannheim, 2 units/µl in 50%
glycerol), and the mixture was incubated at 37°C in a water bath for 1 h.

3. Removing Unincorporated Label 

The unincorporated label was removed using the PCR Clean-up Kit (Boehringer
Mannheim). Briefly, the 20 µl reaction was removed from the mineral oil overlay
with a fine-tipped micropipette and the volume adjusted to 100 µl with TE buffer (pH 7.5)
in a 0.5 ml Eppendorf tube. Then 400 µl of the nucleic acid binding buffer was added to the
DNA along with 10 µl of the silica suspension mix. The mixture was incubated for
10 minutes at room temperature with frequent vortexing. The mixture was centrifuged
for 30 seconds in a microcentrifuge and the supernatant discarded. The matrix




containing the DNA was resuspended with 400 µl of nucleic acid binding buffer by
vortexing. The mixture was centrifuged and the supernatant discarded as before. The
pellet was washed with 400 µl of washing buffer, centrifuged and the supernatant
discarded as before. This step was repeated once more. After the final spin the pellet
was re-centrifuged and any remnants of supernatant were removed with a fine-tipped
micropipette. The pellet was allowed to dry at room temperature for 15 minutes. The
DNA was eluted from the silica matrix by the addition of TE buffer (10 mM Tris (pH
8.4), 1 mM EDTA) and incubation at 65°C for 10 minutes with occasional vortexing.
After centrifugation for 2 minutes at maximum speed in the microcentrifuge, the
supernatant was removed to a clean tube and the silica matrix was again re-suspended
in 100 µl dH2O. The elution procedure was repeated and the supernatant combined
with the first supernatant to give a final volume of 200 µl. It was then determined
that 2 x 10^7 cpm was incorporated. The estimated specific activity of the DNA is 10^8-10^9
cpm/µg.
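
The estimated specific activity follows from dividing the incorporated counts by the mass of probe DNA. The sketch below is a rough check, assuming the 50 ng of template used in the labelling reaction as the DNA mass and ignoring the mass of newly synthesized strands; the function name is hypothetical.

```python
def specific_activity_cpm_per_ug(incorporated_cpm, dna_ng):
    """Specific activity in cpm per microgram of DNA."""
    return incorporated_cpm / (dna_ng / 1000.0)


# 2 x 10^7 cpm incorporated into a reaction that started with 50 ng of DNA.
activity = specific_activity_cpm_per_ug(incorporated_cpm=2e7, dna_ng=50)
print(f"Specific activity: {activity:.1e} cpm/ug")
```

Under these assumptions the result is about 4 x 10^8 cpm/µg, which falls within the stated range.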


4. Hybridization 

Membranes with bound Thermus thermophilus DNA were blocked by
incubation with 50 ml blocking fluid (50% formamide, 5x SSPE (20x SSPE comprises
3 M NaCl, 200 mM NaH2PO4, 0.02 M Na2EDTA adjusted to pH 7.4 with NaOH), 1%
SDS, and a 1/20 dilution of milk solution (10% skim milk dissolved in dH2O, 0.2%
sodium azide)) at 42°C for at least 2 h in a hybridization bag with spout (Boehringer
Mannheim, cat#1666 649). Two µl of the labelled probe was spotted onto 2.4 cm
diameter circle GF/C filters, placed in 5 ml of scintillation fluid in a scintillation tube,
and counted in a scintillation counter. The specific activity of the probe should be
around 1 x 10^9 cpm/µg DNA.
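
The blocking fluid composition can be converted into stock volumes for any batch size. The following is a sketch only: it assumes neat (100%) formamide and a 10% SDS stock, neither of which is specified in the text, while the 20x SSPE stock and the 1/20 dilution of milk solution are taken from the description above. The function name is hypothetical.

```python
def blocking_fluid_volumes(total_ml, sds_stock_percent=10.0):
    """Return stock volumes (ml) for 50% formamide, 5x SSPE, 1% SDS, 1/20 milk."""
    formamide = total_ml * 50.0 / 100.0       # assumed neat formamide to 50%
    sspe_20x = total_ml * 5.0 / 20.0          # 20x SSPE stock to 5x
    sds = total_ml * 1.0 / sds_stock_percent  # assumed 10% SDS stock to 1%
    milk = total_ml / 20.0                    # 1/20 dilution of milk solution
    water = total_ml - (formamide + sspe_20x + sds + milk)
    return {
        "formamide": formamide,
        "20x SSPE": sspe_20x,
        "SDS stock": sds,
        "milk solution": milk,
        "water": water,
    }


recipe = blocking_fluid_volumes(50.0)
for component, volume_ml in recipe.items():
    print(f"{component}: {volume_ml:.1f} ml")
```

The same function gives the volumes for the 20 ml of fresh blocking fluid used during hybridization by passing `20.0` instead of `50.0`.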

The blocking fluid was replaced with 20 ml of fresh blocking fluid.
The labeled probe (200 µl as described above in section 3; 1-3 x 10^7 cpm) was
denatured by the addition of 1/5 volume (40 µl) of 0.2 N NaOH and the mixture was
incubated at room temperature for 10 minutes. The denatured probe was added to the
hybridization bag. Neutralization of this mixture to allow hybridization was
accomplished by the addition of 1/5 volume (40 µl) of 2 M Tris-HCl (pH unadjusted).
The hybridization mixture was incubated at 42°C with rocking overnight. The filters
were washed at least 3 x 100 ml each in the bag with 2x SSPE, 0.1% SDS, and placed
in a pyrex dish with 500 ml of 0.1x SSPE, 0.1% SDS pre-heated to 42°C for 30
minutes. This wash was repeated 1 more time and the filters were air dried, and
exposed to Kodak X-OMAT film or exposed to a phosphorimager plate (Molecular
Dynamics) and viewed in a phosphorimager (Molecular Dynamics).

Using as a probe the cloned fragment amplified with dnaX X1Fa and X1Fb
primers, Southern blot hybridization showed a 7.1 kb band with genomic DNA
digested with PstI. The results of these experiments were: 1) the probe hybridized to
a 7.1 kb PstI band made by digestion of 22.5 µg of Preparation B chromosomal DNA
with 10 µl PstI (100 units) in 25 µl NEB Buffer 3 (NEB) and 190 µl H2O, overnight
at 37°C; and 2) the probe hybridized to an 8.5 kb band in the HindIII digest (22.5 µg of Preparation B
chromosomal DNA digested with 10 µl HindIII (100 units) in 25 µl NEB Buffer 2
(NEB) and 190 µl H2O, overnight at 37°C).

F. Cloning The 7.1 kb PstI Fragment

PstI digests of T. thermophilus genomic DNA were prepared in order to clone
large restriction fragments of a size which was expected to contain the full length T.
thermophilus dnaX gene. The restriction fragments were cloned into a modified pCRII
vector (pMGC707), transformed into E. coli, and cells containing plasmids that carried
dnaX encoding sequences complementary to the probe were identified by a colony
hybridization procedure.

1. PstI Restriction Enzyme Digestion of T. thermophilus
Genomic DNA To Prepare the Library

A 250 µl restriction digest reaction mixture was prepared by incubating
overnight at 37°C 22.5 µg genomic DNA (Preparation B, prepared as described above),
25 µl of 10x restriction endonuclease buffer, 10 µl PstI (10 units/µl), and 190 µl
dH2O. Fifty µl of gel loading dye was added to the digest, and the entire 300 µl
sample was loaded into a 1.5 mm thickness x 4.3 cm well in a 0.7% SeaPlaque low
melt agarose gel and electrophoresed at 45 V constant voltage overnight at 4°C with the DNA Molecular
Weight Standards VII (Boehringer Mannheim). The dye front was run until it had
reached 2/3 the length of the gel, and the gel was stained in Sybr green in 1x TAE for 1 h
with gentle shaking. A gel slice that contained at least 2 mm on either side of the
region corresponding to a 7.1 kb fragment was excised with a clean scalpel.

The DNA was recovered from the gel by incubating the gel slice at 72°C in a
convection heat incubator for 30 minutes or until the gel slice was completely melted.
The melted gel slice was placed into a tube which was inserted into a 45°C
temperature block and allowed to cool for 5 minutes. Then, 1/10 volume of 10x
β-agarase buffer was added along with 4 µl of β-agarase (1 unit/µl, GibcoBRL) per 200
µl of volume. Digestion was carried out at 42°C overnight in a convection heat
incubator and the sample was precipitated by adding 1 µl glycogen (20 mg/ml,
Boehringer Mannheim), 1/10 volume of 2.5 M sodium acetate (pH adjusted to 5.2 with glacial
acetic acid), and 1 volume of isopropanol. The precipitate was centrifuged at 14,000 x
g for 15 minutes and the supernatant discarded. The pellet was washed once with 1
ml of 75% ethanol, the pellet centrifuged at 14,000 x g for 2 minutes and the
supernatant discarded. The pellet was resuspended in 10 µl of MilliQ de-ionized H2O.

2. Cloning dnaX PstI Fragment into pMGC707 

A 7.1 kb fragment that hybridized with the partial dnaX probe generated by
PCR was extracted and cloned into vector pMGC707. pMGC707 was prepared by
cutting pCRII (Invitrogen) with SpeI and NotI and inserting a polylinker resulting from
annealing the oligonucleotides

(5'-GGGCGCAATTGCACGCGTTCGAATTCCATGACGTCTTCCAGTGCACTGGTTAATTA)
(SEQ ID NO:103) and




(5'CTAGTTAATTAACCAGTGCACTGGAAGACGTCATGGAATTCGAACGCGTGCAATTGC) (SEQ ID NO:104).
The polylinker region of the resulting plasmid was cleaved with BstXI to generate
PstI-compatible termini to receive the extracted 7.1 kb population of fragments. As
BstXI does not itself give PstI-compatible termini, this vector was designed so that
it would give PstI-compatible termini.

Vector pMGC707 was cut with BstXI and gel purified to separate linear
product from undigested circles. The cut pMGC707 was ligated with a PstI restriction
fragment of genomic digest prepared as described above, in a 20 µl ligation reaction
[2 µl vector DNA (100 ng), 2 µl insert DNA (1000 ng), 2 µl of 10x ligation buffer
(Invitrogen), 1 µl ligase (GibcoBRL, 3 units/µl), and 13 µl dH2O] by incubating for 1
h at room temperature and overnight at 14°C. The reaction mixture was precipitated
with 1 µl glycogen (20 mg/ml, Boehringer Mannheim), 1/10 volume of 2.5 M sodium
acetate (pH 5.2), and 3 volumes ethanol, and washed once with 1 ml 75% ethanol.
The resulting plasmids were used to transform DH5α E. coli by electroporation as
described supra. Electroporated cells were incubated for 1 h prior to selection
overnight on LB plates containing carbenicillin (50 µg/ml) at 37°C in a convection
heat incubator. Transformed cells which grew on carbenicillin-containing plates were
ready for colony hybridization to detect the presence of sequences encoding T.
thermophilus DNA polymerase III holoenzyme subunits (i.e., "DNA polymerase III
holoenzyme" refers to the whole entity, while "DNA polymerase III" is just the core
[α, ε, θ]).
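
As a quick consistency check on the ligation reaction described above, the component volumes can be summed and the final buffer strength and insert-to-vector mass ratio derived. This is an illustrative sketch only; the dictionary keys and function name are hypothetical, and the values are those stated in the text.

```python
# Component volumes (µl) of the ligation reaction as listed in the text
LIGATION_UL = {
    "vector DNA": 2,
    "insert DNA": 2,
    "10x ligation buffer": 2,
    "ligase": 1,
    "dH2O": 13,
}


def summarize_ligation(components: dict, buffer_stock_fold: float = 10.0):
    """Return (total volume in µl, final buffer strength as a fold concentration)."""
    total = sum(components.values())
    final_buffer = buffer_stock_fold * components["10x ligation buffer"] / total
    return total, final_buffer


total_ul, buffer_fold = summarize_ligation(LIGATION_UL)
insert_to_vector_mass = 1000 / 100   # ng insert per ng vector
ligase_units = 1 * 3                 # 1 µl at 3 units/µl

print(total_ul, buffer_fold, insert_to_vector_mass, ligase_units)
```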

G. Colony Hybridization And Sequencing of T. thermophilus dnaX

Colonies were screened by colony hybridization (Ausubel et al. [1995] supra)
using the T. thermophilus dnaX PCR-generated probe. Plasmids from positive colonies
were purified and those containing 7.1 kb inserts and also a BamHI site as indicated
from the sequence of the PCR probe were retained for further characterization. One
was submitted for full DNA sequencing (Lark Sequencing Technologies, Houston,
TX), resulting in the sequence of the full length gene. These steps are elaborated in
detail below.

1. Preparation of Replica Filters Containing
Transformed Colonies

Replica filters were prepared by filter-to-filter contact with a master filter. A
master filter was prepared by placing a sterile dry nylon membrane filter for colony
and plaque hybridization (Boehringer Mannheim) onto a one day-old LB agar plate
containing carbenicillin (50 µg/ml; Gibco-BRL). The bacteria were applied in a small
volume of liquid (<0.8 ml, containing up to 20,000 bacteria for a 137-mm filter; <0.4
ml, containing up to 10,000 bacteria for an 82-mm filter), and spread over the surface
of the filter leaving a border 2-3 mm wide at the edge of the filter free of bacteria.
The plates were allowed to stand at room temperature until all of the liquid had been
absorbed. The plates were inverted and incubated at 37°C until very small colonies
(0.1-mm diameter) appeared (about 8-10 hours) on the filter.

Using sterile, blunt-ended forceps (e.g., Millipore forceps), the master filter was
gently removed from the first plate and placed on a stack of sterile MC filter paper
colony side up. The second, wetted filter was placed on top of the master filter, taking
care not to move the filters once contact had been made. The filters were pressed
together using sterile blotting paper and by applying pressure to a 140 mm petri-dish
lid placed on top of the sterile blotting paper. Before the filters were peeled apart, 3 to 11
holes were poked through the wedded filters with an 18 gauge hypodermic needle
around the perimeter of the filters to aid in orienting the master and replica filters with
respect to each other. The filters were gently peeled apart and the replica filter
returned to an LB-agar-carbenicillin plate, colony side up, while the master filter was
placed back onto an LB-agar-carbenicillin plate.

The plates (containing master filter and replica filter) were incubated at 37°C
until colonies 1-2 mm in diameter appeared. Colonies on the master plate reached the
desired size more rapidly (6-8 hours). At this stage, while the bacteria were still
growing rapidly, the replica filters were transferred to agar plates containing
chloramphenicol (170-250 µg/ml) and incubated for a further 8 hours at 37°C. The
chloramphenicol plate does not allow growth of the bacterial cell, but the plasmid in
the cell continues to replicate, giving a better signal when the DNA from the colonies
is hybridized. The master plates were sealed with parafilm and stored at 4°C in an
inverted position until the results of the hybridization reaction were available. Bacteria
on the replica filters were lysed and the liberated DNA bound to the replica filters for
subsequent hybridization as follows.

2. Lysis And Binding Liberated DNA To Filters

Using blunt-ended forceps (e.g., Millipore forceps), the replica nylon membrane
filter was peeled from the plate and placed for 3 minutes, colony side up, on MC filter
paper impregnated with 10% SDS. The filter was transferred to a second sheet of MC
paper that had been saturated with denaturing solution (0.5 M NaOH, 1.5 M NaCl) and
left for 5 minutes, then to a third sheet of MC paper that had been saturated with
neutralizing solution (1.5 M NaCl, 0.5 M Tris-Cl (pH 8.0)) and left for 5 minutes.
The filter was allowed to dry, colony side up, at room temperature on a sheet of dry
MC paper for 30-60 minutes. The dried filter was sandwiched between two sheets of
dry MC paper and baked for 2 hours at 80°C in a vacuum oven prior to hybridization
to a 32P-labeled probe. Colony hybridization with a probe demonstrates that the colony
contained plasmids with inserts which contained sequences of T. thermophilus DNA
that were complementary to the probe.

The hybridization was performed as described for Southern blots. One to two
filters were placed in a hybridization bag. If two filters were placed in the bag, the
filters were placed back-to-back with the filter surfaces that had not come into contact
with the colonies against each other. After completion of the hybridization procedure,
the filters were air-dried at least one hour, and the filters were marked with a small
dab of phosphorescent paint over each of the needle holes. When the filters are exposed
to X-ray film, the phosphorescent paint produces a dark spot on the X-ray film that
can be used to align the X-ray image with the master filter to identify specific E. coli
colonies containing plasmids that hybridize to the probe.

The hybridized filters were taped to a piece of Whatman 3MM paper with
Scotch brand Magic tape and the taped filters and Whatman 3MM paper covered with
plastic wrap to prevent contamination of the film cassette. In the darkroom, the filters
were placed in an X-ray film cassette with an intensifying screen and Kodak XR Omat
X-ray film was placed over the top and the cassette sealed. The cassette was placed in
a black garbage bag and sealed with tape to act as a secondary barrier to light
exposure of the film. The bag was placed overnight at -80°C for exposure of the film.

The film was developed by removing the film in the darkroom, and dipping the
film into Kodak developing solution for 3 minutes, followed by a 23 minute soak in
cool tap water, and finished by soaking for 3 minutes in fixer (all developing reagents
used were Kodak). The film was then rinsed with tap water and allowed to air dry for
15 to 30 minutes.

To align the film with the master filter, the developed X-ray film was placed
on a white light box and covered with plastic wrap. The master filter was identified
by the number and orientation of the black spots on the film that aligned with the
holes poked in the master filter. The master filter was then carefully removed from
the agar plate with sterile forceps and placed on the plastic wrap covered X-ray film to
line up the darkened spots on the film with the holes in the master filter. Dark spots
on the film were correlated to colonies on the filter, and colonies were picked and
streaked with a sterile toothpick onto LB-agar plates containing carbenicillin at 150 µg/ml.

Isolated colonies from each streak were grown and plasmid isolated as
described previously (Promega). For dnaX, 4 strongly hybridizing colonies on one
filter from the HindIII 8.4 to 8.6 Kb fragment clones were chosen for plasmid
preparations. Four colonies each from 3 separate filters from the PstI 7.1 Kb fragment
clones (12 total PstI clones) were selected for plasmid preparations. The plasmid
DNA from HindIII and PstI clones was digested with BamHI restriction endonuclease
as previously described, and only 1 colony, designated "pAX2-S (DMSO 1386),"
showed evidence of an insert. The BamHI digest was repeated on the pAX2-S clone, and
the restriction pattern gave bands that were roughly estimated against the VII DNA
molecular weight standards as 6 Kb, 2 Kb, 1.9 Kb, and 0.8 Kb. The addition of these
fragments gave a plasmid size of around 10.7 Kb, roughly the size of plasmid that
would be expected from a 7.1 Kb insert into pMGC707. BamHI restriction
endonuclease digests were used because sequence data from the dnaX partial clone
obtained by PCR described previously indicated the presence of a BamHI site. This
clone was verified to be dnaX by sequencing.
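
The size estimate above rests on the fact that a complete digest of a circular plasmid yields fragments whose lengths add up to the plasmid length. A minimal sketch of that check follows; the function name is hypothetical, and the vector size it derives is only the value implied by the estimated band sizes, not a figure stated in the text.

```python
def plasmid_size_kb(fragments_kb):
    """Sum restriction fragment sizes (kb) from a complete digest of a circular plasmid."""
    return round(sum(fragments_kb), 1)


bamhi_fragments_kb = [6.0, 2.0, 1.9, 0.8]   # band estimates from the BamHI digest
insert_kb = 7.1                             # PstI fragment cloned into pMGC707

plasmid_kb = plasmid_size_kb(bamhi_fragments_kb)
implied_vector_kb = round(plasmid_kb - insert_kb, 1)

print(f"plasmid ~{plasmid_kb} kb, implied vector ~{implied_vector_kb} kb")
```

Because the band sizes were only roughly estimated against the standards, the implied vector size should be read as approximate.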

3. Restriction Digestion of Clones

In order to confirm the presence of inserts in the plasmids which had been
identified as "positives" by colony hybridization, the plasmids were purified and
analyzed by restriction enzyme digestion.

a. Plasmid Purification

Plasmid was prepared and purified from 1-3 ml bacterial culture. Plasmid
preparation was performed using a Promega Plus Miniprep technique (Promega).
Briefly, 1-3 ml of bacterial cell culture was centrifuged for 1-2 minutes at 10,000 x g
in a microcentrifuge, the supernatant poured off and the tube blotted upside-down on a
paper towel to remove excess media. The cell pellet was completely resuspended in
200 µl of Cell Resuspension Solution (Promega, A711E) (50 mM Tris (pH 7.5); 10 mM
EDTA; 100 µg/ml RNase A). 200 µl of Cell Lysis Solution (Promega, A712E) (0.2 M
NaOH and 1% SDS) was added and mixed by inverting the tube 4 times, followed by
the addition of 200 µl of Neutralization Solution (Promega, A713J) (1.32 M potassium
acetate) and mixing by inverting the tube 4 times. The lysate was centrifuged at
10,000 x g in a microcentrifuge for 5 minutes. If a pellet had not formed by the end
of the centrifugation, centrifugation was carried out for an additional 15 minutes.

The plasmid preparation was purified using a vacuum manifold by using
ProMega Plus Minipreps (Promega, A721C), which can be easily processed
simultaneously with Promega's Vac-Man™ or Vac-Man™ Jr. Laboratory Vacuum
Manifold. One ProMega Plus Minipreps column was prepared for each miniprep. The
ProMega Plus Minipreps resin (Promega, A767C) was thoroughly mixed before
removing an aliquot. If crystals or aggregates were present, they were dissolved by
warming the resin to 25-37°C for 10 minutes, then cooling to 30°C before use. One
ml of the resuspended resin was pipetted into each barrel of the Minicolumn/Syringe
assembly (i.e., the assembly formed by attaching the Syringe Barrels to the Luer-Lok®
extension of each Minicolumn), and all of the cleared lysate from each plasmid
preparation was transferred to the barrel of the Minicolumn/Syringe assembly
containing the resin. A vacuum was applied to pull the resin/lysate mix into the
Minicolumn until all of the sample had completely passed through the column.
Extended incubation of the resin and lysate was not necessary since, at the
concentration of plasmid present in most lysates, plasmid binding to the resin was
immediate. 320 ml of 95% ethanol was added to the Column Wash Solution bottle to
yield a Column Wash Solution (Promega, A810E) having a final concentration of 80
mM potassium acetate, 8.3 mM Tris-HCl (pH 7.5), 40 µM EDTA and 55% ethanol. 2
ml of the Column Wash Solution was added to the Syringe Barrel and the vacuum
reapplied to draw the solution through the Minicolumn. The resin was dried by
continuing to draw a vacuum for a maximum of 30 seconds after the solution had been
pulled through the column. The Minicolumn was transferred to a 1.5 ml
microcentrifuge tube and the Minicolumn centrifuged at 10,000 x g in a
microcentrifuge for 2 minutes to remove any residual Column Wash Solution. The
Minicolumn was transferred to a new microcentrifuge tube, and eluted with 50 µl
water or TE buffer (elution in TE buffer was not used if the DNA was subsequently
used in an enzyme reaction, particularly if the DNA was dilute and if large volumes of
the DNA solution were required to be added to a reaction, since EDTA may inhibit
some enzymes by chelating magnesium required as a co-factor for activity). DNA was
eluted by centrifuging the tube at 10,000 x g in a microcentrifuge for 20 seconds. For
large plasmids (>10 kb), water or TE buffer preheated to 65-70°C was used since it
may increase yields. For plasmids >20 kb, water or TE buffer preheated to 80°C was
used. Plasmids were eluted off the column as soon as possible, although DNA
remains intact on the Minicolumn for up to 30 minutes. The eluted plasmid DNA
(50 µl) was stored in the microcentrifuge tube at 4°C or -20°C. Typically, the yield was
0.5-2 µg from a 2 ml culture. For plasmids that underwent additional investigation,
the process was scaled up, and they were purified using Qiagen's preparation methods.

b. Sequencing and Analysis 

Sequencing was performed by Lark Technologies (Houston, TX) as described
below. Standard directions for the use of the DyeDeoxy termination reaction kit (PE/ABI)
were followed. Base-specific fluorescent dyes were used as labels. Some reaction mixtures were
modified to contain DMSO to decrease the formation of secondary structure and allow
the determination of sequence in regions of high G-C base content. Sequencing
reactions were analyzed on 4.75% PAGE by an ABI 373-S using ABI Sequencing
Analysis Software version 2.1.2. Data analysis was performed using Sequencher™ 2.1
software (GeneCodes).

A 2142 bp open reading frame (Figure 9A) (SEQ ID NO:7) was detected.
Within the candidate open reading frame, GUG (the 186th codon from the start of the
open reading frame) was identified as the actual initiation codon from the sequences of
the amino-terminus since it was immediately followed by the previously determined
amino-terminal sequence shown in Figure 8, and since GUG is occasionally used as an
initiation codon in T. thermophilus. Methionine amino-peptidases would be expected
to cleave off the terminal methionine (Sherman et al., Bioessays 3:27-31 [1985]),
revealing serine as determined experimentally. The initiating GTG could only be
conclusively identified by use of the amino-terminal sequence of protein purified from
T. thermophilus. The preceding sequences in the open reading frame are italicized in
Figure 9A. Also in bold and underlined is the AAAAAA sequence that, by analogy to
E. coli, is probably the frameshifting site that permits synthesis of the shorter γ
product. The potential stop codons for γ in the +1 (first) and -1 reading frame are
underlined. The stop codon for the full length τ translation product is
double-underlined.

Alignment of the resulting DNA sequence with the identified E. coli dnaX gene
and putative homologous genes from other eubacteria revealed a high level of identity
in the amino-terminal region of all bacteria, confirming the identity of the isolated
gene to be a dnaX homolog (Figure 9C). To provide the most useful and concise
presentation, only examples from widely divergent organisms are presented in Figure
9C. The amino acid sequences for the organisms listed from the top are: Tth (T.
thermophilus, Chloroflexaceae/Deinococcaceae group) (SEQ ID NO:9); E.coli
(proteobacteria group, gamma division) (SEQ ID NO:10); B.sub. (B. subtilis, firmicute
group, low G+C gram-positive bacteria division) (SEQ ID NO:11); Mycopl
(Mycoplasma pneumoniae, firmicute group, mycoplasma division) (SEQ ID NO:12);
Caulo (Caulobacter crescentus, proteobacteria group, alpha division) (SEQ ID NO:13);
Syn.sp. (Synechocystis sp., Cyanobacteria group (blue green algae)) (SEQ ID NO:14).
Sequences shaded in black are identical among the indicated bacteria; sequences
shaded in gray are similar. The consensus sequence is shown in SEQ ID NO:15.

The deduced amino acid sequence with the amino terminal methionine removed
as indicated from amino-terminal protein sequencing (SEQ ID NO:8) of the T.
thermophilus τ subunit is shown in Figure 9B. The peptide sequences directly
determined from the isolated T. thermophilus DnaX proteins are underlined. A perfect
match was observed with the internal peptide sequence obtained from sequencing of
the 63 kDa candidate γ subunit between residues 428 to 450, further confirming the
isolation of the structural gene for the protein associated with T. thermophilus pol III.
The molecular weight of the predicted T. thermophilus τ subunit is 58 kDa, in
reasonable agreement with the 63±6 kDa determined from SDS PAGE.

In E. coli, the shorter γ product of the dnaX gene is produced by a
translational frameshifting mechanism at the sequence A AAA AAG containing two
adjacent lysine codons read by the Lys UUU anticodon tRNA. A potential frameshift
site that exploits adjacent AAA lys codons flanked by A residues (A AAA AAA A)
was observed at codons 451 and 452 (Figure 9A). Frameshifting into the -1 reading
frame at this site would result in a 51 kDa protein, close to the 50 kDa candidate
observed in Fraction V (Figure 9D). If, instead, a +1 frameshift occurred, a 50 kDa
protein would be produced, indistinguishable within experimental error from the -1
frameshift alternative.
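
The kind of site described above, two in-frame lysine codons preceded by an A, can be located by a simple in-frame scan. The sketch below is illustrative only and is not the analysis performed here: the function name and the two short test sequences are hypothetical and are not taken from the dnaX sequence of SEQ ID NO:7.

```python
LYS_CODONS = {"AAA", "AAG"}


def find_lys_slippery_sites(orf: str):
    """Return 1-based codon numbers where an in-frame 'A AAA AAR' heptamer begins.

    The codon reported is the first of the two adjacent lysine codons.
    """
    orf = orf.upper().replace("U", "T")
    hits = []
    # Start at the second codon so that the preceding base exists.
    for i in range(3, len(orf) - 5, 3):
        first, second = orf[i:i + 3], orf[i + 3:i + 6]
        if orf[i - 1] == "A" and first == "AAA" and second in LYS_CODONS:
            hits.append(i // 3 + 1)
    return hits


# Hypothetical toy reading frames (not real dnaX sequence)
ecoli_like = "ATG" + "GCA" + "AAA" + "AAG" + "TGA"          # A AAA AAG
tth_like = "ATG" + "GCA" + "AAA" + "AAA" + "AGC" + "TGA"    # A AAA AAA A

print(find_lys_slippery_sites(ecoli_like))
print(find_lys_slippery_sites(tth_like))
```

Applied to a full-length reading frame, a scan of this kind would flag candidate sites only; whether frameshifting actually occurs there has to be established experimentally.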

Every region conserved among other eubacterial dnaX genes is found
represented in T. thermophilus dnaX. These sequences include the Walker-type ATP
binding site, represented by GVGKTTT (SEQ ID NO:105) in T. thermophilus dnaX,
and the highly conserved EIDAAS (SEQ ID NO:106) and FNALLKTLEEP sequences
(SEQ ID NO:107) (Figure 9C). The conservation beyond the first 220 residues falls
off for all dnaX genes. Thus, the internal peptide sequence obtained that starts at
residue 427 (SEQ ID NO:108; the second underlined sequence in Figure 9B) was
useful in confirming that the sequence was in the correct reading frame and in
providing further confirmation of the isolation of the structural gene for the same
proteins isolated in association with T. thermophilus pol III.

EXAMPLE 5 

Cloning and Sequencing the T. thermophilus dnaE Gene

The dnaE gene sequence of the T. thermophilus DNA polymerase III was 
obtained using a PCR amplification approach. In this approach, oligonucleotide
primers that could be used for amplifying genomic DNA encoding a portion of the α
subunit of Thermus thermophilus were designed using one of two strategies. The first 
strategy relied on designing PCR primers which hybridized to DNA polymerase III 
sequences that were homologous among several species of bacteria. However, 
insufficient homology was observed between diverse bacterial dnaEs, rendering this 
approach problematic. 

The second strategy relied on designing PCR primers based on the amino acid
sequence information which was obtained by sequencing T. thermophilus DNA
polymerase III α subunit which had been partially purified on protein SDS-PAGE gel (as
described in Example 3, supra). The designed PCR primers were used to isolate T.
thermophilus genomic sequences which encoded T. thermophilus DNA polymerase III
subunit α. First, the PCR primers were used to amplify T. thermophilus genomic
DNA, and the PCR-amplified sequences were used to create clones which contained
dnaE sequences. Second, these clones were used as probes against restriction nuclease
digested T. thermophilus DNA in a Southern reaction that would allow isolation and
cloning of larger segments of genomic DNA of a size expected to contain the entire
coding region of dnaE. Third, downstream sequences were isolated by asymmetric
PCR. Fourth, downstream sequences were used to probe λ libraries of T. thermophilus
chromosomal DNA to isolate full length dnaE clones.

This Example involved (A) Partial amino acid sequencing, (B) Isolation of T. 
thermophilus dnaE probe by PCR amplification of a segment of T. thermophilus dnaE, 
(C) Cloning PCR-amplified dnaE probe, (D) Southern analysis of T. thermophilus 
DNA using isolated 1F-91R dnaE probe, (E) Cloning and sequencing of the 5'
approximately 1100 bp of T. thermophilus dnaE gene, (F) Isolation of downstream
sequences by asymmetric PCR, (G) Southern analysis of T. thermophilus using
approximately 300 bp BspHI/EcoRI dnaE probe isolated from plasmid pB5, and (H)
Sequencing full-length T. thermophilus dnaE.

As described below, a segment of the T. thermophilus dnaE gene was isolated
by PCR and confirmed by DNA sequencing (Figure 13B shows the estimated sequence
for the first 366 amino acids of T. thermophilus DnaE (SEQ ID NO:59)). The full-length
dnaE gene was isolated by using the sequences identified during the development of
the present invention to probe lambda T. thermophilus chromosomal DNA libraries,
subcloning hybridizing sequences, and confirming their identity by DNA sequencing.
This full-length sequence is shown in Figure 17A. In this Figure, the dnaE reading
frame is shown in bold and underlined. The deduced amino acid sequence is shown in
Figure 17B. In this Figure, sequences corresponding to the peptides sequenced from
isolated native T. thermophilus DnaE are shown in bold. Alignment with other
eubacterial dnaE genes confirmed its identity, as the sequences were strongly
homologous in regions conserved between eubacterial dnaEs. Figure 17C provides
alignments with three representative sequences of E. coli, B. subtilis type 1, and
Borrelia burgdorferi DnaE.

The isolated and sequenced full length dnaE gene is engineered to overexpress 
wild type and N- and C-terminal fusions with a peptide containing a hexahistidine 
sequence, as well as a biotinylation site. The N- and/or C-terminal fusion proteins are 
purified and used to obtain T. thermophilus DnaE-specific monoclonal antibodies and
fusion proteins. 

The fusion proteins are also used to make affinity columns for the isolation of 
proteins that bind to DnaE alone, and in a complex with the isolated τ protein, and/or
associated factors of the T. thermophilus DnaX complex. The structural genes for
novel proteins identified by this method are isolated, sequenced, expressed, purified
and used to determine whether they make contributions to the functional activity of the
T. thermophilus DNA polymerase III holoenzyme. Wild type DnaE is purified and
used for the reconstitution of DNA polymerase III holoenzyme. In addition, it is 
developed as an additive to improve the fidelity of thermophilic polymerases that do 
not contain proofreading exonucleases. 

A. Partial Amino Acid Sequencing 

The amino acid sequence at the N-terminus and internal amino acid sequences
were determined as follows.

1. N-Terminus 

An aliquot of Fraction IV containing 500 µg of protein was precipitated by the
addition of an equal volume of saturated ammonium sulfate at 4°C, centrifuged at
15,000 rpm at 4°C in a SS-34 rotor, resuspended in 50 µl Buffer U and dialyzed
overnight versus Buffer U at room temperature. The sample was applied to a 5% SDS
polyacrylamide tube gel (0.5 cm diameter) and subjected to electrophoresis in a 230A
High Performance Electrophoretic Chromatography module (ABS, Inc.). Fractions (25
µl) were collected, and aliquots (5 µl) were subjected to 7.5% slab SDS-PAGE and the
proteins visualized by silver staining. Fractions containing the highest concentration of
130 kDa protein were pooled, concentrated approximately 30-fold on a Centricon-10
membrane (Amicon) and analyzed by preparative gel electrophoresis on 7.5% SDS-
polyacrylamide gels. The gel-fractionated proteins were transferred to a Hyperbond
PVDF membrane (Biorad), and the 130 kDa protein band excised and subjected to N-
terminal amino acid sequence analysis in a 477A Protein Sequencer according to the
manufacturer's (ABS, Inc.) instructions. A sequence of 13 amino acids was obtained:
RKLRFAHLHQHTQ (SEQ ID NO:109). This sequence aligned with 6/13
residues from the amino terminus of the M. tuberculosis and H. influenzae sequences
in a BLAST search, and the alignments are shown in Figure 11. In this Figure, "M. tuber"
refers to M. tuberculosis, while "Tth" refers to T. thermophilus, and "H. infl." refers to
H. influenzae. The numbers in this Figure indicate the amino acid numbers for the
respective residues.

2. Internal Amino Acid Sequence

Internal peptide sequences were determined at the Harvard Microchemistry
facility as described above. An aliquot of Fraction V (same as that used in the DnaX
Examples) (230 µg) was subjected to electrophoresis on a 10% polyacrylamide gel,
and the fractionated proteins transferred to a PVDF membrane (BioRad) as described
above. The Ponceau-S stained 130 kDa band was cut out, subjected to digestion by
endo Lys-C, and the resultant peptides separated by HPLC. Three peptides were
chosen for sequencing (Figure 11). The peptide sequences aligned with sequences of
E. coli dnaE. The peptides were named by the first amino acid in E. coli that
corresponded to the first residue of the T. thermophilus peptide. The peptide
sequences obtained were 64% identical (peptide #91) (SEQ ID NOS:31 and 32); 68%
identical (peptide #676) (SEQ ID NOS:34 and 35); and 54% identical (peptide
#853) (SEQ ID NOS:37 and 38) through the indicated matching segments. Once the
complete dnaE sequence became available, it was apparent that the peptide 853
alignment was not valid. One additional peptide sequence was obtained
(ETTPEDPALAMTDHGNLF; SEQ ID NO:208) that was not used as a PCR probe
because of the lack of a reliable alignment with eubacterial dnaE sequences. However,
the sequence was useful in verifying the final dnaE DNA sequence. Sequences of
homology between these peptides are shown in SEQ ID NOS:33, 36, and 39,
respectively.

B. Isolation Of T. thermophilus dnaE Probe By PCR And
PCR Amplification of T. thermophilus dnaE

Primers were selected from the amino-terminal protein sequence that would
provide the least degeneracy. Because the order of the sequences was known by their
alignment with other eubacterial dnaE sequences, only one primer was made for the
extreme amino- and carboxyl sequences, 1F and 853R, respectively. Both forward and
reverse primers were made for the internal sequences to enable performing PCR
reactions with all possible primer combinations. The peptide sequences selected and
the corresponding oligonucleotide probes are summarized in Table 2.

Table 2. PCR primers used for isolation of T. thermophilus dnaE probes



Peptide #   Sequence used for PCR primer   Primer number¹   Primer sequence(s) 5'->3'                Primer degeneracy (-fold)

2           HLHQHTQ (SEQ ID NO:191)        1F               CAYYTNCAYCARCAYACNCA (SEQ ID NO:110)     512

91          EGFYEK (SEQ ID NO:192)         91F              GARGGNTTYTAYGARAA (SEQ ID NO:111)        64
                                           91R              TTYTCRTARAANCCYTC (SEQ ID NO:112)        64

676         YQEQQMQ (SEQ ID NO:193)        676F             TAYCARGARCARCARATGCA (SEQ ID NO:113)     320
                                           676R             TGCATYTGYTCYTGRTA (SEQ ID NO:114)        16

853         DGGYFH (SEQ ID NO:194)         853R             TGRAARTANCCNCCRTC (SEQ ID NO:115)        128

¹ F, forward primer; R, reverse primer


For PCR reactions, the Boehringer Mannheim Expand™ High Fidelity PCR
system was used with T. thermophilus chromosomal DNA preparation A following the
manufacturer's recommendations except as noted. Buffer 1 contained 2 µl of 10 mM
dNTP mix (Gibco/BRL; 200 µM final); 4 µl of the forward and reverse primers (3 µM
initially; 240 nM final in 100 µl reaction); 1 µl of a 1/30 dilution of T. thermophilus
genomic DNA (20 ng total), and 39 µl distilled H2O. Buffer 2 contained 10 µl of 10-fold
concentrated Boehringer Mannheim Expand HF Buffer with 15 mM MgCl2, 39 µl
H2O, and 1 µl Expand High Fidelity enzyme mix (mixture of Taq and Pwo DNA
Polymerase; 3.5 units total). Buffers 1 and 2 were combined in a 0.2 ml thin walled
tube (Intermountain) and the PCR reaction was conducted in an MJ Research thermal
cycler as follows: annealing steps were conducted at 48°C, elongation at 68°C and the
melting step at 94°C; 26 total cycles were run. Reactions were initiated by placing the
reaction mixture in a block pre-warmed to 95°C, incubated for 5 min, and the
following cycle was initiated: (a) reactions were incubated at 94°C for 30 s (melting
step); (b) primer annealing was permitted to occur for 45 s at 48°C; (c) primers were
elongated at 68°C for 2 min. After the conclusion of step (c), the reaction block was
cycled to 94°C and the cycle was restarted at step (a). After ten cycles, step (c) was
increased 20 s for each successive cycle until it reached 7 min; 26 cycles total were
run. The block was then cycled to 4°C and held there until the sample was removed
for further use.
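
The elongation schedule described above can be tabulated per cycle. The sketch below is one reading of the text, not a program taken from the thermal cycler: it assumes the first 20 s increment applies to cycle 11 and that the elongation time is capped once it reaches 7 min. The function and parameter names are hypothetical.

```python
def elongation_times(total_cycles=26, base_s=120, fixed_cycles=10,
                     step_s=20, cap_s=420):
    """Return the elongation time (s) of step (c) for each cycle, in order."""
    times = []
    for cycle in range(1, total_cycles + 1):
        # No increment for the first ten cycles; +20 s per cycle afterwards.
        extra = max(0, cycle - fixed_cycles) * step_s
        times.append(min(base_s + extra, cap_s))   # capped at 7 min
    return times


schedule = elongation_times()
for cycle, seconds in enumerate(schedule, start=1):
    print(f"cycle {cycle:2d}: 94°C 30 s, 48°C 45 s, 68°C {seconds} s")
```

Under these assumptions the 7 min limit is first reached at cycle 25, so only the final two cycles run at the full elongation time.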

DNA was precipitated by the addition of 1 µl glycogen (Boehringer Mannheim)
and 100 µl isopropanol, centrifuged in a microfuge for 2 min at 14,000 x g; the
supernatant was discarded and the pellet was washed with 300 µl 75% ethanol (room
temperature). The pellet was resuspended in 10 µl TE buffer and 2 µl of gel loading
dye (0.25% Bromophenol Blue, 0.25% Xylene cyanol, 25% Ficoll (Type 400) in
distilled H2O) was added. The entire sample was loaded onto a 3% Metaphor agarose
gel (FMC) in TAE buffer (0.04 M Tris-acetate, 1 mM EDTA (pH 8.5)) along with a
123 bp standard and a 100 bp ladder (Gibco/BRL) in separate lanes. The gel was run
at 30 V overnight, stained with 1/10,000 dilution of Sybr Green I (Molecular Probes)
in TAE buffer for 1 h at room temperature with gentle shaking, and photographed
under short-wave UV irradiation. The results of the analysis are summarized below.
A. Primer pair: 1F-91R. Size of band(s): 330 bp, near the expected 314 bp
band. Two other faint bands were visible at approximately 250
and 980 bp.

B. Primer pair: 91F-676R. Blank lane on gel; no band produced.

C. Primer pair: 676F-853R. Size of band: 520-550 bp, near the expected 522 bp
band. 11 bands ranging from 200 to >1500 bp. 9 bands were of equal
or greater intensity than the desired band. (This band did not
result in a clone that provided dnaE sequence information.)

D. Primer pair: 1F-676R. Size of band: 2100 bp, near the expected 2050 bp
band. Only one other band was visible, near 1200 bp. (After
cloning, this band was sequenced and found to be incorrect,
that is, not the dnaE gene.)

E. Primer pair: 91F-853R. Blank lane; no bands visible.

C. Cloning PCR-Amplified dnaE Probe

The products resulting from primer pairs 1F-91R and 1F-676R were extracted
using the freeze squeeze technique (described in Example 4, section D) and cloned into
vector pCRII (Invitrogen) as described above.

Plasmid isolates were prepared using the Promega Plus Minipreps technique
and examined by restriction analysis to determine whether they yielded an EcoRI
fragment approximately the same size as the desired cloned PCR product (the cloning
site is flanked by EcoRI sites). From 8 colonies picked from the 1F-91R cloning, 7
yielded a fragment around the expected 320 bp. From 8 colonies picked from the
1F-676R cloning, 8 yielded a fragment around the expected size. Plasmid DNA was
prepared from the 1F-91R and 1F-676R clonings using the Qiagen Plasmid Midi Kit
and sequenced (Colorado State University DNA Sequencing Facility, Ft. Collins, CO).

The sequence of the PCR product (SEQ ID NO:40) resulting from
amplification of T. thermophilus chromosomal DNA with primers 1F and 91R
(plasmid pMGC/E1-91.B; DMSO 1322) is shown in Figure 12A. In this Figure, the
sequences resulting from the primers are underlined and shown in bold (SEQ ID NOS:41
and 42). In Figure 12, a portion of the BLAST search results using only the sequence
between the primers is shown. The first segment aligned (bases 2-55 in the T.
thermophilus query sequence) aligned with M. tuberculosis and Synechocystis sp. DnaE
in the +2 reading frame; the remainder of the alignment is in the +3 reading frame,
indicating a frameshift error in the DNA sequence. The percent identity for three
homology stretches was found to be 50%, 63%, and 46% (M. tuberculosis) and 57%,
47%, and 47% (Synechocystis sp.).

D. Southern Analysis Of T. thermophilus DNA Using
Isolated 1F-91R dnaE Probe

A Southern analysis of T. thermophilus chromosomal DNA was conducted as
described above (Example 4, section F) to determine the restriction fragments most
likely to contain a full-length T. thermophilus dnaE gene. T. thermophilus
chromosomal DNA was digested with the indicated restriction enzymes alone and in
combination as described above. Approximately 3 µg DNA was subjected to
electrophoresis, transfer to membranes and blotting as previously described (Ausubel et
al. [1995] supra). Probe preparation and the amount used were as described above.

The sizes of the bands that hybridized with the probe in restriction
enzyme-digested DNA were as follows:

PstI            10.2 kb
PstI/HindIII    0.95 kb
HindIII         0.95 kb
BspHI           8.5 kb

E. Cloning And Sequencing Of The 5' Approximately 1100 bp
of T. thermophilus dnaE Gene

A BspHI digest of T. thermophilus chromosomal DNA (preparation B) was
subjected to electrophoretic separation on an agarose gel, and the region corresponding
to the 8.5 kb fragment that hybridized with the dnaE 1F-91R probe (cut out of plasmid
pE1-91B) was isolated.

The isolated 8.5 kb population of BspHI fragments was cloned into pMGC707
as described above, except that the cleavage site was not within the polylinker. The
BspHI site is at nucleotide 3088 in the parental pCRII vector, approximately 1120
nucleotides away from the polylinker.

Colonies were screened by colony hybridization (Ausubel et al. [1995] supra,
sections 6.1.1-6.1.3 and 6.3.1-6.3.4; see also production of the T. thermophilus 1F-91R
probe isolated from plasmid pE1-91B). A total of approximately 4,000 colonies from 2
plates was screened. Two positive colonies were selected and restreaked on LB plates
containing 150 µg/ml carbenicillin to ensure purity of the selected clones. Two
separate colonies were selected from each for further examination. Of these, only two
(DMSO 708B and 709A) had approximately 8 kb inserts as judged by HindIII
digestion of plasmids isolated by the Promega Plus Minipreps. Plasmid pBspHIdnaE
(DMSO 708B) was sequenced (Colorado State University DNA Sequencing Facility),
resulting in the approximate sequence of the first 1,109 bases of the T. thermophilus
dnaE gene (SEQ ID NO:58) (Figure 13A). In this Figure, the apparent start of this
preliminary sequence is underlined; the upstream sequence is not shown in this Figure.
The start of this sequence was estimated by alignment with the N-terminal protein
sequence. Apparently the initiating Met and the following Gly were removed by
proteolysis.

The BLAST search using this sequence indicated that there were at least two
frameshift errors in the sequence, since the first part aligned in frame 1, then a long
stretch aligned in frame 2, and then a final stretch aligned in frame 1. From these
alignments, it was determined that the first frameshift error probably occurred between
bases 102 and 119, and the second occurred between bases 787 and 832. The
sequences were edited in order to bring all of the homology into one open reading
frame by deleting C108, which was part of a string of 6 Gs. An "N" (i.e., any base)
was added after base 787, to produce the sequence shown in Figure 13B (SEQ ID
NO:59). The best alignments resulting from a BLAST search with the amino acid
sequence shown in Figure 13B are shown in Figure 13C (SEQ ID NOS:60-62). In this
Figure, the grey boxes indicate bases that are similar, but not identical, between the
sequences, while black boxes indicate identical bases.
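
The reasoning used above, in which a change of alignment frame localizes a frameshift error, can be illustrated with a toy example. The sketch below is not the procedure used in this Example; the sequence is invented for illustration, and the helper names (`translate`, `frames_containing`) are hypothetical. It translates a short DNA string in the three forward frames and reports which frame contains a given peptide, before and after one extra G is removed from a run of Gs.

```python
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, built by pairing codons (in TCAG order) with amino acids.
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna, frame):
    """Translate dna in forward frame 1, 2 or 3; '*' marks a stop codon."""
    start = frame - 1
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(start, len(dna) - 2, 3))


def frames_containing(dna, peptide):
    """Forward frames (1-3) whose translation contains the peptide."""
    return [f for f in (1, 2, 3) if peptide in translate(dna, f)]


# Toy sequence (not from the patent) carrying one extra G in a run of Gs.
with_error = "ATGAAACT" + "G" * 8 + "CCGGATGCGGAT"
corrected = with_error[:9] + with_error[10:]  # delete one G from the run

upstream_frames = frames_containing(with_error, "MKL")     # before the run
downstream_frames = frames_containing(with_error, "PDAD")  # after the run
```

In the toy sequence the upstream peptide is found in frame 1 and the downstream peptide in frame 2; after the single-base deletion both fall in frame 1, which mirrors the kind of edit described for the sequence of Figure 13B.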
F. Isolation Of Downstream Sequences By Asymmetric PCR

Experiments attempting to use primer E853R in combination with primers 91F
and 676F failed to isolate downstream dnaE sequences. Thus, in an attempt to obtain
further downstream sequence, the following asymmetric PCR procedure was developed;
it relied only on an upstream primer and knowledge of a downstream PstI
restriction site gained through Southern analysis.

Asymmetric PCR uses one primer to extend a stretch of DNA rather than the two
primers normally used. The advantage of this technique is that it allows
amplification of a region either upstream or downstream of the primer and subsequent
cloning without further knowledge of the DNA sequence. A single DNA primer was mixed
with genomic DNA, and the polymerase was used to make a single-strand copy of the
template DNA as far as it could extend. The single-stranded DNA was then made
double-stranded by random hexamer-primer annealing and extension with Klenow
polymerase, and the double-stranded DNA was cloned directly into the vector after
creating the appropriate ends by digestion of the double-stranded DNA with the chosen
restriction enzymes. Although the reaction tends to be error-prone, it was thought that
it might be useful in providing preliminary sequence information that would aid in the
isolation of the full-length gene.

1. Primer Design

The primer was selected from known T. thermophilus dnaE sequence;
additionally, it was biotinylated at the 5' end and contained a PacI restriction site to
allow cloning. The primer sequence was
5'-biotin-CCGCGCTTAATTAACCCAGTTCTCCCTCCTGGACG-3' (SEQ ID NO:116). The
PacI restriction site, highlighted in bold, has a rare 8-base recognition sequence (i.e.,
it contains only As and Ts), and would be expected to be extremely rare in Thermus
thermophilus with its highly (69%) GC-rich genome. This is important because it
restricts possible clones into the PacI site of the vector to DNA containing the primer.
The GCGC region preceding the PacI site clamps the DNA, thus allowing highly
efficient cleavage of the PacI site, and PacI is known to cleave efficiently with as little
as 2 base pairs of DNA flanking the cleavage site. Biotin on the 5' end was used to
purify the amplified DNA containing the primer, as explained below.
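
The expectation that an all-A/T 8-base site is rare in a 69% GC genome can be made roughly quantitative. The following is a minimal sketch under a simplifying assumption that is not part of the text: bases are treated as independent, with A and T equally frequent and G and C equally frequent. The function name `expected_spacing` is illustrative.

```python
def expected_spacing(site, gc_fraction):
    """Expected number of bases between occurrences of `site` in random DNA
    with the given GC fraction (independent bases; A = T and G = C)."""
    p_at = (1.0 - gc_fraction) / 2.0   # probability of A (or of T)
    p_gc = gc_fraction / 2.0           # probability of G (or of C)
    prob = 1.0
    for base in site:
        prob *= p_at if base in "AT" else p_gc
    return 1.0 / prob


PACI_SITE = "TTAATTAA"
PRIMER = "CCGCGCTTAATTAACCCAGTTCTCCCTCCTGGACG"  # SEQ ID NO:116 as given in the text

spacing_unbiased = expected_spacing(PACI_SITE, 0.50)  # 4**8 = 65,536 bp
spacing_tth = expected_spacing(PACI_SITE, 0.69)       # on the order of 3 x 10**6 bp
primer_has_site = PACI_SITE in PRIMER
```

Under this model a PacI site is expected about once per 65 kb in unbiased sequence but only about once per 3 Mb at 69% GC. Real genomes are not random sequence, so the figure is only an order-of-magnitude guide.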

2. PCR Amplification

For the single-stranded amplification step, the Boehringer Mannheim Expand™
PCR system was used following the manufacturer's recommendations, except as
noted in the following. Buffer A contained 3.5 µl of 10 mM dNTP mix (Gibco/BRL;
350 µM final), 0.5 µl of the forward primer (80 µM stock concentration; 400 nM final
in 100 µl reaction), and 46 µl distilled H2O. Buffer B contained 10 µl of 10-fold
concentrated Boehringer Mannheim Long Template PCR System Buffer 2 (500 mM
Tris-HCl (pH 9.2 at 25°C), 160 mM ammonium sulfate, 22.5 mM MgCl2), 37.5 µl
H2O, 1.5 µl Expand Long Template enzyme mix (mixture of Taq and Pwo DNA
polymerases; 5 units total) and 1 µl of T. thermophilus genomic DNA (preparation A,
0.6 µg total). Buffers A and B were combined in a 0.2 ml thin-walled tube
(Intermountain) and the single primer extension reaction was conducted in an MJ
Research thermal cycler as follows.
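
The final concentrations quoted for the reaction components follow from the listed volumes by simple dilution (C1 x V1 = C2 x V2). The sketch below recomputes them; the dictionary layout and the function name are illustrative only, and the volumes are those given in the text.

```python
def final_concentration(stock_conc, stock_volume_ul, total_volume_ul):
    """C1*V1 = C2*V2 solved for C2; the result has the units of stock_conc."""
    return stock_conc * stock_volume_ul / total_volume_ul


# Component volumes (µl) as listed in the text.
buffer_a = {"dNTP mix": 3.5, "primer": 0.5, "water": 46.0}
buffer_b = {"10x buffer": 10.0, "water": 37.5, "enzyme": 1.5, "genomic DNA": 1.0}
total_ul = sum(buffer_a.values()) + sum(buffer_b.values())

dntp_uM = final_concentration(10_000, buffer_a["dNTP mix"], total_ul)   # 10 mM stock, in µM
primer_nM = final_concentration(80_000, buffer_a["primer"], total_ul)   # 80 µM stock, in nM
mgcl2_mM = final_concentration(22.5, buffer_b["10x buffer"], total_ul)  # from the 10x buffer
```

The volumes sum to 100 µl, and the computed values (350 µM dNTP mix, 400 nM primer) agree with the figures stated for Buffer A.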

Annealing steps were conducted at 55°C, elongation at 68°C and the melting
step at 95°C; 60 total cycles were run. Reactions were initiated by placing the reaction
mixture in a block pre-warmed to 95°C and incubating for 5 min., and the following cycle
was initiated: (a) reactions were incubated at 95°C for 30 s (melting step); (b) primer
annealing was permitted to occur for 30 s at 55°C; (c) primers were elongated at 68°C
for 12 min. After the conclusion of step (c), the reaction block was cycled to 94°C
and the cycle was restarted at step (a). Upon completion of 60 cycles, the block was
cycled to 4°C and held there until the sample was removed for workup.

After completion of the PCR reaction, the Boehringer Mannheim PCR cleanup
kit was used to remove unextended primer. The PCR cleanup kit efficiently binds
DNA that contains greater than 100 bp of double-stranded DNA. Most long
single-stranded DNA would be expected to contain enough secondary structure to bind to the
silica beads, while the primers would not. The PCR cleanup kit was used as described
above. Bound DNA was eluted overnight in 50 µl TE buffer, pH 8.4, followed by an
additional wash with 50 µl of distilled H2O.

The extended, biotinylated DNA was removed from the remaining genomic
DNA by binding the biotinylated DNA to agarose beads coated with monomeric avidin
(Soft-Link, Promega). Soft-Link resin (100 µl settled beads) was incubated 2x with 0.4
ml of 10 mM biotin and washed 6x with 10% acetic acid by resuspending the gel in 1 ml and
spinning dry in 0.45 µm spin filters (Life Sciences). The resin was then washed once
with 0.5 ml of 0.5 M potassium phosphate (pH 7.0) and 3x with 50 mM potassium
phosphate (pH 7.0), then resuspended in 0.5 ml of 50 mM potassium phosphate (pH
7.0). DNA from the single primer elongation reaction was added to 20 µl of the beads
(packed volume) (final reaction volume 120 µl), incubated 30 min. with gentle
agitation every few minutes, and washed with 0.5 ml TE buffer (pH 7.5) followed by
centrifugation (14,000 x g, 30 s) in the spin filters to remove unbiotinylated DNA.
The wash was repeated 5 times. The biotinylated DNA was eluted by the addition of
40 µl of 10 mM biotin to the beads and incubation in a capped spin filter for 5 min. in a
42°C water bath. The DNA was recovered by centrifugation (14,000 x g, 320 s); the
elution step was repeated once more. To ensure complete removal of DNA, the beads
were washed with 0.1 M NaOH and spun as before, and the basic mixture was neutralized by
the addition of 2 M Tris-HCl. The neutralized solution had a final pH of 7.8 as tested
by pH paper. The DNA resulting from the alkaline-eluted and biotin-eluted fractions
was ethanol-precipitated by the addition of 1 µl glycogen, 1/10 volume sodium acetate
(adjusted by the addition of acetic acid to pH 5.2), and 3 volumes ethanol, followed by
centrifugation at 14,000 x g for 10 min. The samples were suspended in 50 µl
distilled H2O and combined.

To convert the single-stranded DNA to double-stranded DNA, random
hexamer priming was used as previously described, following the protocol of the
manufacturer (Boehringer Mannheim). A 5x reaction mixture was used (100 µl
reaction volume) with the following components: approximately 40 µl DNA, 3 µl
each dATP, dTTP, dGTP and 5 µl dCTP, 10 µl Boehringer Mannheim random primed
DNA labeling kit component 6 "reaction mixture" (containing random hexamer
oligonucleotides), 25 µl water, and 5 µl Klenow polymerase (Boehringer, 10 U). The
reaction was incubated for 30 min. at 37°C.

DNA from the random-hexamer priming was precipitated using 1 µl glycogen,
1/10 volume sodium acetate, and 3 volumes of ethanol. The pellet was washed 2x
with 0.2 ml of 75% ethanol and resuspended in 200 µl TE buffer, pH 7.5.
About 15 µg of total DNA was recovered, as quantitated with PicoGreen™ as described
above.

A PacI digest of 6.5 µg of the recovered DNA was performed to prepare the
primer end of the product for cloning. DNA (90 µl, 6.5 µg)
was digested (3 h, 37°C) in the presence of 25 µl 10x NEB buffer 1 (10 mM Bis Tris
Propane-HCl, 10 mM MgCl2, 1 mM DTT (pH 7.0 at 25°C)), 10 µl PacI enzyme (NEB,
100 units), and 125 µl distilled H2O. The cleaved DNA was precipitated by the addition
of 1 µl glycogen, 25 µl sodium acetate, and 3 volumes of ethanol. The product
recovered by centrifugation (5 min., 14,000 x g) was dissolved in 75 µl distilled H2O.

Analysis of Southern digests showed that there was a PstI site downstream of
the BspHI site that defined the end of the longest T. thermophilus dnaE clone.
Therefore, the asymmetric PCR product was digested with PstI to provide a distal
cloning site to enable cloning of more downstream sequence.

PacI-digested DNA (37 µl containing 3.25 µg DNA) was combined with 12.5
µl 10x NEB buffer #3 (50 mM Tris-HCl, 10 mM MgCl2, 100 mM NaCl, 1 mM DTT
(pH 7.9 at 25°C)), 5 µl PstI (NEB, 25 units), and 65.5 µl dH2O and digested for 2 h at
37°C. The digested DNA was precipitated by the addition of 1 µl glycogen, 25 µl
sodium acetate and 3 volumes of ethanol, recovered by centrifugation, dissolved in 125
µl dH2O, and quantitated with PicoGreen™ as described. The concentration was 9
ng/µl (i.e., 1125 ng total recovery). DNA was precipitated again as before and
dissolved in 2 µl dH2O.

The vector was digested with PacI and BstXI (which gives a PstI-compatible end for
cloning) as follows: 10 µl DNA (pMGC707, 30 µg) was digested (4 h, 55°C) in the
presence of 25 µl 10x NEB buffer #3, 10 µl BstXI (NEB, 100 units), and 205 µl distilled
H2O. The product was precipitated by the addition of 2 µl glycogen, 1/10 vol.
sodium acetate (2.5 M, pH 5.2), and 2 volumes ethanol, collected by centrifugation, and
redissolved in 50 µl TE buffer (pH 7.5). The redissolved DNA (15 µg) was further
digested (37°C overnight) with PacI (5 µl PacI, 50 units, NEB) in the presence of 10 µl
10x NEB buffer #1 and 60 µl distilled H2O.

DNA was run preparatively in 0.6% Seakem agarose in 1x TAE at 85 V for 4
hrs. A single band was obtained and recovered by the freeze/squeeze technique
described above. DNA was precipitated with ethanol by the standard procedure
described above.

3. Ligation of PacI/PstI-digested Asymmetric PCR Product
Into PacI/BstXI-digested pMGC707 DNA to Make pB5

Digested pMGC707 vector DNA (2 µl, 0.1 µg) was mixed with 2 µl of
PacI/PstI-digested asymmetric PCR product (1 µg) together with 2 µl 10x ligation
buffer (Invitrogen), 1 µl T4 DNA ligase (Invitrogen, 4 Weiss units) and 13 µl distilled
H2O and incubated 1 h at 23°C and then overnight at 14°C. DNA was precipitated
with ethanol, centrifuged and resuspended in 10 µl distilled H2O. 4 µl of this mixture
was transformed by electroporation as described above. To enrich the population for
kanamycin-resistant bacteria, the transformation mixture was grown in liquid culture
for 1 hr without selection and 2.5 hrs in the presence of kanamycin (50 µg/ml), then
streaked for isolated colonies on LB. Ten colonies were picked and DNA was
prepared using Promega Plus minipreps. Two of the ten had inserts of roughly 1.8 kb
as judged by comparison of PacI-digested DNA with linearized vector.

The candidate inserts from isolates pB4 and pB5 also released a 0.95 kb
fragment upon HindIII digestion, as predicted from the Southern blot of T.
thermophilus chromosomal DNA against a dnaE probe. One (pB5, DMSO 1335) was
sequenced and shown to have homology to E. coli dnaE. The region of asymmetric
PCR product corresponding to the more N-terminal end (i.e., the "front end" of the
clone) of the T. thermophilus dnaE gene is shown in Figure 14A (SEQ ID NO:63). In
this Figure, the region corresponding to the forward primer is underlined and shown in
bold. The distal end of the PCR clone was also sequenced (i.e., the "back end" of the
clone), in order to provide an indication of whether extensive new sequence had been
gained, or if the full-length gene was obtained by the asymmetric PCR. The sequence
of this clone is shown in Figure 14B (SEQ ID NO:180). The PstI sequence that
defined the front end of the T. thermophilus DNA is underlined in Figure 14A.
BLAST alignment (Figure 14B; SEQ ID NO:180) indicated that the most distal end
corresponded to roughly residue 464 of E. coli DnaE, considerably short of the
full-length 1160 amino acid protein. Thus, this approach to identify the T. thermophilus
dnaE sequence was abandoned. However, this alignment did reveal the critical
"PDXD" motif (SEQ ID NO:117) that defines two critical aspartate residues making
up part of the eubacterial DNA polymerase III active site. This motif is bolded and
underlined in Figure 14. T. thermophilus DnaE was found to be 62% identical to E.
coli sequences in this region.

G. Southern Analysis Of T. thermophilus Using Approximately 300 bp
BspHI/EcoRI dnaE Probe Isolated From Plasmid pB5 And Sequencing
Full-Length T. thermophilus dnaE

Sequence analysis of the BspHI T. thermophilus dnaE clone indicated that it
encoded only the amino-terminal portion of the sought gene. After further downstream
sequence of a portion of the T. thermophilus dnaE gene was obtained by use of the
asymmetric PCR cloning method, a probe was designed that was more internal to the
gene, enhancing the chances of detecting a restriction fragment that encoded
downstream regions of T. thermophilus dnaE. The most distal C-terminal-encoding
fragment of pB5, bounded by the terminal vector EcoRI site and a BspHI site internal
to dnaE, was used as a probe for more Southern blots. T. thermophilus chromosomal
DNA was digested with the indicated restriction enzymes alone and in combination.
Approximately 3 µg DNA was subjected to electrophoresis, transferred and blotted as
previously described (Ausubel, supra, sections 2.9.2 to 2.9.11). Following the
hybridization procedures described and using 120 ng of a probe labeled by the random
hexamer priming method (2.5 x 10^7 cpm), we determined the sizes of the bands that
hybridized with the probe following restriction digestion as follows (all sizes of
restriction fragments from this and all Southerns are ±20%):

BamHI           4.9 kb
BamHI/KpnI      4.2 kb
KpnI            7.4 kb
BglII           13.8 kb
BglII/NcoI      2.1 kb
NcoI            2.1 kb
BamHI/BglII     4.9 kb
BamHI/NcoI      2.1 kb

PvuI, SpeI and XhoI digests yielded only very high molecular weight DNA.

A critical part of the restriction map gleaned from the above (distances
here, as determined from all Southerns, are ±20%) was:
BamHI--700 bp--KpnI--4200 bp--BamHI--3200 bp--KpnI
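
The map can be checked against the BamHI, KpnI and BamHI/KpnI band sizes reported for the Southern blot. The sketch below is illustrative only: site coordinates take the first BamHI site as position 0, and the probe position is a hypothetical value chosen to lie inside the 4200 bp KpnI-BamHI interval, since the text does not give an exact probe coordinate.

```python
# Site positions (bp) implied by the map, taking the first BamHI site as 0.
SITES = [("BamHI", 0), ("KpnI", 700), ("BamHI", 4900), ("KpnI", 8100)]


def hybridizing_fragment(enzymes, probe_pos, sites=SITES):
    """Size of the fragment containing probe_pos after digestion with `enzymes`.

    Returns None when the probe is not flanked on both sides by mapped cut
    sites, since the map says nothing about DNA outside it.
    """
    cuts = sorted(pos for name, pos in sites if name in enzymes)
    left = [c for c in cuts if c <= probe_pos]
    right = [c for c in cuts if c > probe_pos]
    if not left or not right:
        return None
    return right[0] - left[-1]


PROBE_POS = 2000  # hypothetical position inside the KpnI-BamHI interval

bam = hybridizing_fragment({"BamHI"}, PROBE_POS)
kpn = hybridizing_fragment({"KpnI"}, PROBE_POS)
both = hybridizing_fragment({"BamHI", "KpnI"}, PROBE_POS)
```

With the probe placed in that interval, the predicted fragments are 4900, 7400 and 4200 bp, matching the 4.9 kb BamHI, 7.4 kb KpnI and 4.2 kb BamHI/KpnI bands.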

This method was used to isolate the putative full-length T. thermophilus dnaE
clones (Lofstrand Labs) as described briefly below. The bacterial DNA was randomly
cut with Sau3AI (i.e., a 4-base cutter) by partial digestion and ligated into phage vector
lambda GEM-12 at the XhoI site (Promega). The library was packaged using
Epicenter Technologies packaging extract and plated out for screening. The plaques
were lifted in duplicate and probed using 32P random-primed pB5E insert DNA (780
bp fragment isolated from low melt agarose using KpnI and BspHI). The six primary
plates containing greater than 20,000 pfu per plate demonstrated about 100 duplicate
positives each. Twelve duplicate positive primary plaque areas were picked and
replated to enrich the clones. A third plating was performed using duplicates from the
second screen, and four duplicate positive isolated plaques were chosen for
amplification and DNA purification. The phage were grown in liquid lysate cultures
and the virus was purified by two CsCl density gradient centrifugations. The four
DNA clones were digested with KpnI and BamHI individually, and with KpnI and BamHI
together. The ethidium bromide-stained 0.8% TBE gel indicated that clones BGP2 7.1 and
BGP2 8.1 appear to contain full-length genes based on the restriction patterns present.
A Southern blot was also performed on the four clones cut with BamHI and
BamHI+KpnI using the pB5E probe (probe 2), which showed strong signals in the predicted
4.9 kb BamHI fragment and 7.4 kb KpnI fragment. The 4.9 kb BamHI fragment was
subcloned into plasmid (pBSIIKS+) for sequencing.

The full-length sequence of Tth dnaE (SEQ ID NO:196) was obtained (See,
Figure 17A). In this Figure, the dnaE reading frame is shown in bold and underlined.
The deduced amino acid sequence (SEQ ID NO:197) is shown in Figure 17B. In this
Figure, sequences corresponding to the peptides sequenced from isolated native Tth
dnaE are shown in bold. Alignment with other eubacterial dnaE genes confirmed the
identity of the full-length dnaE sequence. Figure 17C shows alignments of the Tth
dnaE gene (SEQ ID NO:198) with three representative sequences from E. coli (SEQ
ID NO:199), B. subtilis type 1 (SEQ ID NO:200), and Borrelia burgdorferi (SEQ ID
NO:201) dnaE.

H. Construction of Vectors Expressing Tth α and
Biotin-Tagged α, and Purification of Expressed Proteins

1. Construction of Starting Vectors

First, a vector, "pDRK-C" (described by Kim and McHenry, J. Biol. Chem.,
271:20690-20698 [1996]), containing a pBR322 origin of replication, a gene
expressing the lacIQ repressor protein, and a semisynthetic E. coli promoter (pA1) that
is repressed by the lacI repressor, was modified. Plasmid pDRKC DNA was prepared
and digested with KpnI, the resulting overhanging ends were removed by treatment
with T4 DNA polymerase in the presence of Mg++ and the four dNTPs (0.1 mM), and
the plasmid was resealed with T4 DNA ligase in the presence of 1 mM ATP. Plasmids were
transformed into E. coli, and plasmid-containing colonies were selected based on
ampicillin resistance. Plasmids were prepared from these colonies and screened for
loss of the KpnI site. One of the colonies that contained plasmid that was not cleaved
by KpnI was selected, grown, and used for preparation of the resulting plasmid,
"pDRKC-Kpn minus." Approximately 40 µg of DNA was obtained from an 80 ml culture.
The resulting plasmid was sequenced in the region of the KpnI site, and was found to
contain the expected sequence.

Plasmid pDRKC-Kpn minus DNA was digested with XbaI and SpeI to remove a
small polylinker that contained sites XbaI-NcoI-NotI-DraIII-SpeI. The following
oligonucleotide (the sequences of each strand are shown below; SEQ ID NOS:202 and
203) was synthesized and inserted into the digested plasmid to replace the polylinker
of pDRKC-Kpn minus with XbaI--AGGAGG--PacI--NcoI--spacer--KpnI--spacer--FseI--
SpeI (i.e., the following oligonucleotides were synthesized separately, annealed to
form a duplex with sticky ends, and inserted into cut plasmid).

ATG# P63-S1 (SEQ ID NO:202):
CTAGAGGAGGTTAATTAACCATGGAAAAAAAAAGGTACCAAAAAAAAAGGCCGGCCA

ATG# P63-A1 (SEQ ID NO:203):
TCCTCGAATTAATTGGTACCTTTTTTTTTCCATGGTTTTTTTTTCCGGCCGGTGATC
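
The order of elements described for the replacement polylinker can be checked directly against the sense-strand oligonucleotide. The following is a minimal sketch; the recognition sequences are the standard ones for the named enzymes, while the dictionary and function names are illustrative.

```python
# Recognition sequences of the enzymes named in the linker description.
SITES = {"PacI": "TTAATTAA", "NcoI": "CCATGG", "KpnI": "GGTACC", "FseI": "GGCCGGCC"}

# Sense-strand oligonucleotide (SEQ ID NO:202) as printed in the text.
P63_S1 = "CTAGAGGAGGTTAATTAACCATGGAAAAAAAAAGGTACCAAAAAAAAAGGCCGGCCA"


def site_positions(seq, sites=SITES):
    """Map each enzyme name to the 0-based index of its first site (-1 if absent)."""
    return {name: seq.find(motif) for name, motif in sites.items()}


positions = site_positions(P63_S1)
order = sorted(positions, key=positions.get)  # enzymes in 5'-to-3' order
```

The oligonucleotide begins with the CTAG overhang compatible with an XbaI end, followed by AGGAGG and then the PacI, NcoI, KpnI and FseI sites in that order, in agreement with the description above.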

Ampicillin-resistant colonies obtained by transformation of the resulting
plasmid were screened for production of plasmid that had gained a KpnI site. DNA
from one plasmid ("pA1-CB-Nco-1") was prepared, and replacement of the original
polylinker with the above oligonucleotide (SEQ ID NOS:202 and 203) was confirmed
by DNA sequencing using methods known in the art.



2. Construction of a Vector Expressing Wild-Type
Tth DNA Polymerase III α Subunit

A 320 bp NcoI/KpnI fragment containing the 5' end of the Tth dnaE gene was
cloned into the corresponding sites of plasmid pA1-CB-Nco-1 to generate plasmid
"pA1-TE(5')." The resulting clones were screened for the NcoI/KpnI sites. A plasmid
DNA preparation was made from one positive clone. Then, a 3454 bp KpnI/FseI
fragment (obtained from plasmid pA1-NB-TE, described below) containing the 3' end
of the Tth dnaE gene was subcloned into the corresponding sites of plasmid
pA1-TE(5'). The resulting clones were screened for the KpnI/FseI sites, and for the correct
size PacI/FseI fragments (3.7 kb/5.6 kb) and HindIII fragments (0.95 kb/9.2 kb), in
separate digests. A plasmid DNA preparation from one positive clone yielded
approximately 125 µg of DNA from an 80 ml culture. The resulting plasmid, "pA1-TE,"
was transformed into E. coli strain MGC 1030 (mcrA, mcrB, lambda-, lexA3,
uvrD::Tc, OmpT::Kn; available from Enzyco). Individual colonies were selected and
screened for expression of the Tth DNA polymerase III α subunit.

3. Verification of Expression of Wild-Type Tth
DNA Polymerase III α Subunit by pA1-TE/MGC 1030

Two candidate isolates containing pA1-TE in MGC 1030 were inoculated into
2 ml of 2x YT culture medium (10 g tryptone, 10 g yeast extract, and 5 g NaCl per
liter) containing 100 µg ampicillin, and grown overnight at 30°C in a shaking
incubator. After 18-24 h of incubation, 0.5 ml of the now-turbid culture was inoculated
into 1.5 ml of fresh 2x YT medium. The cultures were incubated for 1 hour at 37°C
with shaking. Expression was induced by addition of IPTG to a final concentration of
1 mM. After 3 hours post-induction, cells were harvested by centrifugation. The cell
pellets were immediately resuspended in 1/10 culture volume of 2x Laemmli sample
buffer (2x solution: 125 mM Tris-HCl (pH 6.8), 20% glycerol, 4% SDS, 5%
β-mercaptoethanol, and 0.005% bromphenol blue w/v) and sonicated to shear the DNA.
The samples were heated for 10 minutes at 90-100°C and centrifuged to remove
insoluble debris. A small aliquot of each supernatant (3.5 µl) was loaded onto a
4-20% SDS-PAGE mini-gel (Novex, EC60255; 1 mm thick, with 15 wells/gel) in 25
mM Tris base, 192 mM glycine, and 0.1% SDS. A protein migrating between the
β/β' subunits of RNA polymerase and the high molecular weight standard of the
Gibco 10 kDa protein ladder (120 kDa) was observed as a faint band in the induced
cultures, but was not observed in the uninduced control. This protein was determined
to be consistent with the expected molecular weight of 137,048 Da. The detected
protein represented less than 0.5% of E. coli protein.

4. Construction of a Vector That Expresses Tth
DNA Polymerase III α Subunit with an N-Terminal
Biotin/His Tag

Plasmid pDRK-N(M), a plasmid designed for expression of proteins with an
amino-terminal tag containing a peptide that is biotinylated in vivo, a hexahistidine
site, and a thrombin cleavage site (See, Kim and McHenry, J. Biol. Chem.,
271:20690-20698 [1996]), and containing a pBR322 origin of replication, a gene
expressing the lacIQ repressor protein, and a semisynthetic E. coli promoter (pA1) that
is repressed by the lacI repressor, was modified. The following two oligonucleotides
were separately synthesized, annealed to form a duplex with sticky ends, and inserted
into cut plasmid. A synthetic linker/adapter consisting of annealed oligonucleotides
(SEQ ID NOS:204 and 205, shown below)

ATG #P64-S1 (SEQ ID NO:204):
CTAGGAAAAAAAAAGGTACCAAAAAAAAAGGCCGGCCACTAGTG

ATG #P64-A1 (SEQ ID NO:205):

was prepared and cloned into the AvrII/SalI sites of plasmid pDRK-N(M), to convert
the polylinker following the fusion peptide from AvrII-DraIII-SalI to AvrII--spacer--
KpnI--spacer--FseI--SpeI--SalI. The resulting colonies were screened for introduction
of an SpeI site carried by the linker/adapter. One positive clone ("pA1-NB-Avr-2")
was selected and confirmed by DNA sequencing across the linker/adapter region.

A PCR fragment (411 bp) representing the 5' end of the Tth dnaE gene was
generated using primers ATG#P69-S561 (SEQ ID NO:206) and ATG#P69-A971
(SEQ ID NO:207):

ATG#P69-S561 (SEQ ID NO:206): GAATTCCTAGGCCGCAAACTCCGCTTC

ATG#P69-A971 (SEQ ID NO:207): GTGCTCGCGCAGGATCTCCCGGTCAATC

The product was cleaved with AvrII and KpnI, and the resulting 320 bp
fragment was cloned into the corresponding sites of plasmid pA1-NB-Avr-2. The
clones were screened for introduction of a 320 bp AvrII/KpnI fragment. One positive
clone ("pA1-NB-TE(5')") was selected, and the sequence across the PCR fragment was
confirmed by DNA sequencing.

A 3454 bp KpnI/FseI fragment representing the 3' end of the Tth dnaE gene
was isolated from the 4.9 kb BamHI fragment cloned into "pBSIIKS+" (See, Example
5, part G), cloned into the corresponding sites of plasmid pA1-TE(5'), and screened
for the 3454 bp KpnI/FseI fragment. Additional confirmation of the plasmid structure
was obtained by analysis of NdeI/SpeI, PstI, and AvrII/FseI digests. Plasmid
preparation from an 80 ml culture of pA1-NB-TE yielded approximately 130 µg of
DNA. The plasmid pA1-NB-TE was transformed into E. coli strain MGC 1030,
individual colonies were selected, and screened for expression.

5. Verification of Expression of Protein Containing an
Amino-Terminal Biotinylated Peptide Tag Fused to
Tth DNA Polymerase III α Subunit Expressed
by pA1-NB-TE/MGC 1030

An overnight culture of three suspected biotinylated Tth α-expressing plasmids
was inoculated 1:50 into 2x YT culture medium containing 100 µg/ml ampicillin, and
grown at 30°C, with shaking, until the OD600 reached approximately 0.6. Protein
expression was then induced with IPTG, at a final concentration of 1 mM, followed by
the addition of biotin at a final concentration of 10 µM. The control culture contained
biotin, but was not induced with IPTG. After 3 hours of induction, the OD600 of the
culture was determined and the cells were harvested by centrifugation. Cell pellets
were resuspended in 2x Laemmli sample buffer at 70 µl/OD600, and sonicated to shear
the DNA. Samples were heated for 10 minutes at 90-100°C, and centrifuged to
remove cell debris. A 3 µl aliquot (corresponding to 0.0429 OD600 units) was loaded
onto an SDS-PAGE gel, electrophoresed, and stained with Coomassie blue. An
intensely staining band representing at least 3% of total E. coli protein was observed.
This band, which was not present in the uninduced control, migrated between the β/β'
subunits of RNA polymerase and the high molecular weight standard of the Gibco 10
kDa protein ladder (120 kDa), consistent with the expected molecular weight (140,897
Da). The expected sequence of the expressed protein is shown in Figure 17D.

Biotin blots (See, Example 8, and Kim and McHenry, J. Biol. Chem., supra)
were also prepared, in order to confirm the presence of a biotinylated tag on the
expressed protein from pA1-NB-TE. Separated proteins from the SDS-PAGE gel
described above were transferred to a nitrocellulose membrane at 30 volts, in 12 mM
Tris base, 96 mM glycine, 0.01% SDS, and 20% methanol, for 60 minutes at room
temperature. Each lane of the gel contained 1 µl of supernatant, corresponding to
0.0143 OD600 units of the culture material. In all three cultures tested that expressed a
novel protein of the expected molecular weight, a strong biotinylated band was
detected that was at least as intense as the endogenous E. coli biotin carrier protein
band at approximately 20 kDa.
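
The loading figures quoted in this section follow from the stated resuspension ratio of 70 µl of sample buffer per OD600 unit of cells. A minimal Python sketch of that arithmetic is given here; the function and constant names are illustrative only and are not part of the original procedure.

```python
# Resuspension ratio stated in the text: 70 µl of sample buffer per OD600 unit.
UL_PER_OD_UNIT = 70.0


def od_units_loaded(volume_ul: float, ul_per_od: float = UL_PER_OD_UNIT) -> float:
    """Return the OD600 units of culture represented by a loaded sample volume."""
    return volume_ul / ul_per_od


# 3 µl was loaded for the Coomassie-stained gel, 1 µl per lane for the biotin blot.
for volume in (3.0, 1.0):
    print(f"{volume:.0f} µl -> {od_units_loaded(volume):.4f} OD600 units")
```

The two values agree with the 0.0429 and 0.0143 OD600 units reported for the stained gel and the biotin blot, respectively.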

6. Large-Scale Growth of pA1-NB-TE/MGC1030

Strain MGC1030 (pA1-NB-TE) was grown in a 250 L fermentor, to produce
cells for purification of peptide-tagged Tth α. F-medium (1.4% yeast extract, 0.8%
tryptone, 1.2% K2HPO4, and 0.12% KH2PO4) was sterilized, and ampicillin (100 mg/L)
and kanamycin (35 mg/L) were added. A large-scale inoculum (28 L) was initiated
from 1 ml of DMSO stock culture (i.e., culture stored in DMSO), and grown
overnight at 37°C. Approximately 16.7 L of the inoculum was transferred to the 250 L
fermentor (starting OD600 of 0.05), containing F-medium with 1% glucose, and 100
mg/L ampicillin. The culture was incubated at 37°C, with 40 LPM aeration, and 20
rpm. Expression of biotin-tagged Tth DnaE was induced when the culture reached an
OD600 of 0.71. Additional ampicillin (100 mg/L) and biotin (10 µM) were added at
induction with 1 mM IPTG. Additional ampicillin (100 mg/L) was added after 1 hour
of induction. Cell harvest was initiated 3 hours after induction, and the cells were
chilled to 19°C during harvest. The harvest volume was approximately 182 L, and the
final harvest weight was approximately 2.35 kg of cell paste. An equal amount (w/w)
of 50 mM Tris (pH 7.5) and 10% sucrose solution was added to the cell paste.
Quality control results showed 5 out of 10 positive colonies on ampicillin-containing
medium before induction and 5 out of 10 positive colonies post-induction. Cells were
frozen by pouring the cell suspension into liquid nitrogen, and stored at -80°C, until
processed. Figure 17D shows the deduced amino acid sequence (SEQ ID NO:223) of
Tth DnaE, containing a biotin/hexahis tag on the amino terminus.

7. Cell Lysis of pA1-NB-TE/MGC1030 and Determination of
Optimal Ammonium Sulfate Precipitation Conditions for
Biotinylation/Hexahis Tagging of Tth α (DnaE)

Cells (100 g) obtained from the fermentation described above were subjected
to a lysis procedure equivalent to that described for cells expressing biotinylated Tth τ
(DnaX), in Example 8. The lysate supernatant (400 ml) was found to contain 9.2 g
total protein. The supernatant was divided into 6 separate 60 ml aliquots. Based on
this protein concentration, each aliquot originally contained 1,370 mg total protein.
Ammonium sulfate sufficient to achieve 35, 40, 45, 50, 60, and 70% saturation at 0°C
was added to each of the aliquots. After dissolving the ammonium sulfate, each
aliquot was stored on ice for 30 minutes. Precipitated protein was then recovered by
centrifugation (12,000 rpm in Sorvall GSA rotor, at 0°C). A separate (i.e., additional)
aliquot (0.5 ml) was also taken for analysis, and the pellet recovered by centrifugation
in a microfuge (5 min at 14,000 rpm, at 4°C). Two volumes of saturated ammonium
sulfate were then added to the recovered supernatants. After storage on ice, the
precipitated protein pellets were recovered as described above. This procedure
permitted the estimation of the quantity of biotinylated α subunit in both the
supernatant and the pellet after the initial precipitation.

Pellets from each of the aliquots (both the initial pellets and those recovered from
the supernatants) were taken and dissolved to the original volume in buffer T+25
(containing 50 mM Tris-HCl (pH 7.5), 20% glycerol, 5 mM DTT, 0.1 mM EDTA, and
25 mM NaCl), and the total protein determined using Pierce's Bradford reagent and
method, with BSA used as a standard. Based on this value, 24, 54, 95, 439, 1280, and
1230 mg protein were recovered from the 35, 40, 45, 50, 60, and 70% ammonium
sulfate pellets, respectively.
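
The protein figures in this section can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: the variable and function names are assumptions, and the recovery percentages are derived here from the reported values rather than stated in the original text. The lysate figures as quoted (9.2 g in 400 ml) give 1,380 mg per 60 ml aliquot, close to the reported 1,370 mg; the small difference presumably reflects rounding of the measured concentration.

```python
# Figures quoted in the text.
LYSATE_VOLUME_ML = 400.0
LYSATE_PROTEIN_MG = 9200.0      # 9.2 g total protein
ALIQUOT_VOLUME_ML = 60.0
REPORTED_ALIQUOT_MG = 1370.0    # value reported for each aliquot

# Protein (mg) recovered in the pellet at each % ammonium sulfate saturation.
PELLET_MG = {35: 24.0, 40: 54.0, 45: 95.0, 50: 439.0, 60: 1280.0, 70: 1230.0}


def protein_per_aliquot() -> float:
    """Protein per aliquot computed from the rounded lysate figures."""
    return LYSATE_PROTEIN_MG * ALIQUOT_VOLUME_ML / LYSATE_VOLUME_ML


def percent_recovered(pellet_mg: float, input_mg: float = REPORTED_ALIQUOT_MG) -> float:
    """Percentage of the aliquot's total protein found in the pellet."""
    return 100.0 * pellet_mg / input_mg


print(f"computed protein per aliquot: {protein_per_aliquot():.0f} mg")
for saturation, mg in PELLET_MG.items():
    print(f"{saturation}% saturation: {mg:.0f} mg ({percent_recovered(mg):.1f}% of input)")
```

On these numbers, the 45% cut carries only about 7% of the total protein into the pellet, which is consistent with its selection as an enriching precipitation condition.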

Each pellet was then analyzed for the presence of biotinylated Tth DNA
polymerase III α subunit by biotin blot analysis, as described in Example 8, as well as
by SDS-PAGE analysis and Coomassie Blue staining. For these analyses, 10 µg of
each aliquot were used and either stained with Coomassie Blue, or transferred to a
nitrocellulose membrane for biotin blotting. Based on these results, it was apparent
that significant levels of Tth α precipitated even in the lowest (35% saturation)
ammonium sulfate concentration, and that no biotinylated Tth α was detectable in the
50% saturated ammonium sulfate supernatant. The ratio of
biotinylated Tth α to total protein was highest in the 35% pellet, and was decreased by
approximately 3-fold in the 45% pellet. Precipitation with 45% ammonium sulfate
was selected as the precipitation condition to provide material for development of
additional purification methods.

8. Purification of N-Terminal Biotin/Hexahis Tagged Tth α

A 45% saturated ammonium sulfate precipitate obtained from 30 g of cells was
prepared by the procedure described above. The resulting pellet contained 141 mg
protein. The pellet was dissolved in 10 ml buffer EB (50 mM sodium phosphate (pH
7.6), 300 mM NaCl, 5 mM 2-mercaptoethanol), and applied to a 1.1 ml Qiagen
Ni-NTA agarose column equilibrated in buffer EB. The column was washed with 24 ml
buffer EB containing 1 mM imidazole, and was eluted with a 12-column volume
gradient, in buffer EB with the imidazole concentration ranging from 1 to 100 mM.
Activity, as measured by the gap filling assay at 30°C, eluted with a peak at
approximately tube 22 of the 26 tubes collected (each tube contained 0.5 ml). The
pooled peak (fractions 17-23) contained 2.2 mg protein, and 6.5 x 10^5 units of
gap-filling activity, of the 1.4 x 10^6 gap filling units applied to the column.
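
The yield and enrichment implied by these figures can be estimated as follows. This is a sketch under stated assumptions: the pooled activity is read as 6.5 x 10^5 units, the full 141 mg pellet is taken as the protein applied to the column, and the derived yield, specific activities, and fold purification are computed here rather than reported in the original text.

```python
# Figures quoted in the text (pooled activity read as 6.5e5 units; an assumption).
APPLIED_UNITS = 1.4e6
APPLIED_PROTEIN_MG = 141.0   # assumes the whole dissolved pellet was loaded
POOLED_UNITS = 6.5e5
POOLED_PROTEIN_MG = 2.2


def specific_activity(units: float, protein_mg: float) -> float:
    """Gap filling units per mg of protein."""
    return units / protein_mg


load_sa = specific_activity(APPLIED_UNITS, APPLIED_PROTEIN_MG)
pool_sa = specific_activity(POOLED_UNITS, POOLED_PROTEIN_MG)
yield_percent = 100.0 * POOLED_UNITS / APPLIED_UNITS
fold_purification = pool_sa / load_sa

print(f"load specific activity: {load_sa:,.0f} units/mg")
print(f"pool specific activity: {pool_sa:,.0f} units/mg")
print(f"activity yield:         {yield_percent:.1f}%")
print(f"fold purification:      {fold_purification:.1f}")
```

Under these assumptions the Ni-NTA step recovers a little under half of the applied activity while enriching it roughly 30-fold.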

9. Analysis of Molecular Weight and Purity of
Purified Biotin/Hexahis Tagged Tth Alpha

Samples of the peak fractions obtained from the Ni-NTA column described
above were subjected to SDS-PAGE, using two SDS-PAGE gels, along with molecular
weight standards (NEB). One gel was stained with Coomassie Blue, and the proteins
present in the other gel were transferred to a nitrocellulose membrane for biotin
blotting. Purified Tth biotin/hexahis tagged α, detected by biotin blot, migrated 3 mm
farther than a biotinylated 165,000 Da fusion of the maltose binding protein and
β-galactosidase, and 7 mm less than biotinylated rabbit muscle phosphorylase, indicating
a molecular weight near 147,000 Da. This value is consistent within the error of the
procedure with an expected value of approximately 141,000 Da. Tagged Tth α was
found to migrate more slowly than its E. coli counterpart.

On Coomassie-stained SDS-PAGE gels, the major band, corresponding to both
the band induced with IPTG detectable from the crude extracts and the biotinylated
band from crude and purified protein, represented greater than 30% of the total protein
on the gel, and was at least 6-fold more intense than the most abundant contaminants.

10. Determination of Temperature Optimum for Catalytic
Activity of Purified Tth α Subunit

This section describes the determination of the temperature optimum for DNA
synthesis on a gapped template by purified Tth N-terminal Biotin/Hexahis tagged
DnaE (α).

A gap filling assay on nuclease-activated DNA was performed at varying
temperatures to determine the temperature optimum for the purified Tth α subunit of
DNA polymerase III. Biotin/hexahis tagged α obtained from Ni++-NTA
chromatography described in Example 5, Section H, Subpart 8, was diluted 50-fold in
EDB (enzyme dilution buffer) (50 mM HEPES (pH 7.5), 20% glycerol, 0.02% NP40,
0.2 mg/ml BSA) on ice and 0.5, 1 and 2 µl were pipetted into separate tubes and
prewarmed for 3 minutes to the specified temperature. A premix of assay solution was
made by pipetting (per tube in assay set) 18 µl EDB, 3 µl dNTPs (400 µM dCTP,
dATP, dGTP, 150 µM [3H]dTTP (96 cpm/pmol total nucleotide)), 1 µl 250 mM MgCl2,
and 1 µl nuclease activated salmon sperm DNA (5 mg/ml) (Enzyco). Then,
23 µl of the assay solution was prewarmed to the designated temperature and pipetted
into the diluted enzyme solution to initiate the reaction. Incubation was continued for
an additional five minutes. Upon completion of the assay, tubes were transferred to
ice and 2 drops of 0.2 M sodium pyrophosphate and 0.5 ml of 10% TCA were
added. The resulting suspension was then filtered over GFC filters (Whatman)
prewetted with the acid wash solution (1 M HCl, 0.2 M sodium pyrophosphate) and
washed with an additional 12 ml of acid solution. The filters were then washed with 4
ml 95% ethanol, dried, and the bound radioactivity determined by liquid scintillation
counting. One unit of enzyme is defined as that amount of polymerase that
incorporates 1 pmol total nucleotide into acid insoluble DNA/minute.
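
The unit definition above can be turned into a short calculation that converts scintillation counts to enzyme units. The following Python sketch assumes a simple background subtraction and a direct correction for the 50-fold enzyme dilution; neither step is spelled out in the text, and the example count is hypothetical. All names are illustrative.

```python
# Values taken from the assay description.
CPM_PER_PMOL = 96.0      # specific radioactivity, cpm per pmol total nucleotide
INCUBATION_MIN = 5.0     # reaction time in minutes
DILUTION_FACTOR = 50.0   # enzyme was diluted 50-fold in EDB before assay

# Premix components per tube (µl); together they make up the 23 µl added per reaction.
PREMIX_UL = {"EDB": 18.0, "dNTPs": 3.0, "250 mM MgCl2": 1.0, "activated DNA": 1.0}


def units_in_reaction(cpm: float, background_cpm: float = 0.0) -> float:
    """Units = pmol total nucleotide incorporated per minute (background handling assumed)."""
    pmol_incorporated = (cpm - background_cpm) / CPM_PER_PMOL
    return pmol_incorporated / INCUBATION_MIN


def units_per_ul_undiluted(cpm: float, diluted_enzyme_ul: float,
                           background_cpm: float = 0.0) -> float:
    """Activity of the undiluted enzyme stock, correcting for the 50-fold dilution."""
    return units_in_reaction(cpm, background_cpm) / diluted_enzyme_ul * DILUTION_FACTOR


example_cpm = 4800.0  # hypothetical count, for illustration only
print(f"premix volume per tube: {sum(PREMIX_UL.values()):.0f} µl")
print(f"units in reaction:      {units_in_reaction(example_cpm):.1f}")
print(f"units/µl of stock:      {units_per_ul_undiluted(example_cpm, 1.0):.0f}")
```

Assaying 0.5, 1 and 2 µl of diluted enzyme, as described, allows a check that the computed units scale in proportion to the volume added.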

The same assay was used to monitor purifications described in Example 5, with
the exception being that as appropriate, the assays were conducted at 30°C, and the
enzyme was diluted to a point where the amount added to the assay gave a response in
the linear range (i.e., amount of enzyme where radioactivity incorporated was
proportional to amount of enzyme added). In these assays, enzyme was added to the
assay mix on ice, and the entire solution was transferred to a 30°C water bath.

Conducting the assay at 30, 40, 50, 52.5, 55, 57.5, 60, 62.5, 65, 70,
and 80°C revealed a temperature optimum of approximately 60°C, a temperature that
is clearly higher than that of E. coli DNA polymerases. The activity of the enzyme
(units/µl) at the above temperatures was 19, 74, 290, 390, 400, 490, 500, 500, 460,
390 and 170, respectively.
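
The temperature profile reported above can be summarized programmatically. The sketch below pairs each assay temperature with its reported activity and finds the peak; the 90% threshold used to describe the plateau is an arbitrary illustrative choice, not a criterion from the original text.

```python
# Assay temperatures (°C) and the corresponding reported activities.
temperatures = [30, 40, 50, 52.5, 55, 57.5, 60, 62.5, 65, 70, 80]
activities = [19, 74, 290, 390, 400, 490, 500, 500, 460, 390, 170]

peak = max(activities)

# Temperatures at which the maximal activity was observed.
optimal = [t for t, a in zip(temperatures, activities) if a == peak]

# Temperatures retaining at least 90% of the peak (illustrative threshold).
near_optimal = [t for t, a in zip(temperatures, activities) if a >= 0.9 * peak]

print("peak activity:", peak)
print("temperatures at peak:", optimal)
print("temperatures with >= 90% of peak:", near_optimal)
for t, a in zip(temperatures, activities):
    print(f"{t:>5} °C  {100.0 * a / peak:5.1f}% of peak")
```

The maximum is shared by the 60 and 62.5°C points, in line with the stated optimum of approximately 60°C.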

11. Purification of Tth DNA Polymerase III α Subunit
to Homogeneity

The Tth DNA polymerase is purified to homogeneity. Ammonium sulfate
fractionation experiments similar to those described above (See, section 7, above)
are conducted, with the exception being that the catalytic gap filling assays are
conducted at an elevated optimal temperature for Tth α (e.g., 60°C). As inactivation
of the E. coli protein is achieved by adding the protein to be assayed to pre-warmed
tubes, an unambiguous determination of the distribution of the Tth protein is obtained.
The ammonium sulfate concentration required for reproducible precipitation of at
least 80% of Tth activity is determined. Backwashing procedures are then developed,
using approximately 1/20 of the initial lysate volume of an ammonium sulfate solution
that results in extraction of the maximal level of contaminants, while leaving most of
the Tth α subunit in the pellet. The concentration of ammonium sulfate to be used in
the backwash solution that yields at least 60% of the initially pelleted Tth α activity
and optimal purity is then determined. In the situation in which activity does not
provide a suitable assay because of low level wild-type α expression or other causes,
antibodies directed against the purified tagged α are produced, and Western blots are
used to quantitate the distribution of Tth α, in a manner similar to that used for the
biotin blots to monitor biotin-tagged Tth α. In some embodiments, particularly in cases
where the level of Tth α expression is unsatisfactory, up to the first 30 codons of the
Tth α gene are replaced with AT-rich codons commonly used by E. coli, yet coding
for the same amino acid sequence (i.e., to improve expression).

Once an optimal method for production of an ammonium sulfate fraction
containing wild-type Tth α is achieved, the obtained material is redissolved in buffer I
(50 mM imidazole-HCl (pH 6.8), 20% glycerol, 5 mM DTT, 0.1 mM EDTA), and
dialyzed against buffer I plus 25 mM NaCl. The solution is then applied to a
BioRex-70 column (BioRad), equilibrated with buffer I, after further dilution to the
conductivity of buffer I plus 25 mM NaCl. Alternatively, other ionic strengths
determined in pilot experiments to result in binding of Tth α to the column are used.
The column is then washed with buffer I containing a salt level that does not elute
Tth α (approximately 25 mM NaCl), and then the activity is eluted with a 10-column
volume gradient of from 25 to 300 mM NaCl in buffer I. It is contemplated that in
some situations, higher salt concentrations are needed, in order to elute activity.
Indeed, it is contemplated that this may be necessary with any of the columns
described in these Examples.

The peak is detected by the gap filling assay previously described. Fractions
containing the highest specific activity (i.e., no less than half of the specific activity of
the most pure fraction) are pooled. In any case, no activity that comprises less than
40% of the tube containing the peak activity is pooled. The pooled activity is then
precipitated by the addition of ammonium sulfate to 60% saturation and the
precipitated protein is recovered by centrifugation.
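
The pooling rule described above combines two thresholds, one on specific activity and one on total activity per tube. A minimal sketch of that rule is shown here, assuming that "most pure fraction" means the tube with the highest specific activity. The fraction values are hypothetical and serve only to illustrate the selection; the class and function names are likewise illustrative.

```python
from dataclasses import dataclass


@dataclass
class Fraction:
    tube: int
    units: float        # gap filling activity in the tube
    protein_mg: float   # total protein in the tube

    @property
    def specific_activity(self) -> float:
        return self.units / self.protein_mg


def select_pool(fractions, min_specific_ratio=0.5, min_activity_ratio=0.4):
    """Return tube numbers that pass both pooling thresholds."""
    best_specific = max(f.specific_activity for f in fractions)
    peak_units = max(f.units for f in fractions)
    return [
        f.tube
        for f in fractions
        if f.specific_activity >= min_specific_ratio * best_specific
        and f.units >= min_activity_ratio * peak_units
    ]


# Hypothetical elution profile (not data from the original text).
hypothetical_fractions = [
    Fraction(10, 100.0, 0.5),
    Fraction(11, 450.0, 0.5),
    Fraction(12, 900.0, 1.0),
    Fraction(13, 1000.0, 1.0),
    Fraction(14, 600.0, 1.0),
    Fraction(15, 300.0, 1.0),
    Fraction(16, 420.0, 1.5),
]

pooled = select_pool(hypothetical_fractions)
print("pooled tubes:", pooled)
```

In this made-up profile, tube 16 has enough total activity but is excluded for low specific activity, while tube 15 fails the activity threshold.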

Tth α is then further purified by Toyo-Pearl Ether chromatography using
buffers and conditions similar to those described in Example 2, except that the column
is loaded at approximately 4 mg/ml of column packing material, and the divalent ions
and ATP may be excluded from the buffers. The same provisions apply to increasing
or decreasing the load, as described above for the BioRex-70 column. In cases where
Tth α does not bind satisfactorily to the column, ToyoPearl Phenyl 650M columns are
substituted. It is also contemplated that similar materials made by other manufacturers
(e.g., Pharmacia) will find use in the present invention.

Precipitated material is then subjected to anion exchange chromatography,
unless SDS-PAGE gels indicate that the protein is already greater than 95% pure. The
redissolved pellet is dialyzed in buffer I, and applied to a Q-Sepharose column
equilibrated with buffer I, at a load of approximately 4 mg/ml resin, with the same
provisions as indicated above. The column is then washed with buffer I containing a
salt concentration that does not elute Tth α (e.g., approximately 50 mM), and the
column is eluted with a gradient starting with the wash buffer and ending with 300
mM NaCl. The peak fractions are selected and precipitated by addition of ammonium
sulfate to 60% saturation, and the pellet is recovered by centrifugation.

The final step in the purification involves Sephacryl S-300 chromatography. A
Sephacryl S-300 column is equilibrated in 20 mM potassium phosphate (pH 6.5), 0.1
mM EDTA, 5 mM DTT, 20% glycerol, and 100 mM KCl. Tth α is then dissolved at
5-20 mg/ml in the same buffer and applied in 1-2% of the total column volume. The
column is eluted with the equilibration buffer. Pooled peak fractions are rapidly frozen
in liquid nitrogen and stored at -80°C for further use.

EXAMPLE 6

Cloning and Sequencing the T. thermophilus dnaA Gene

A. Design of PCR Primers

The PCR primers for dnaA were designed by using highly conserved amino
acid sequences from regions of the dnaA gene in a variety of bacteria. For the design
of the forward primers, the consensus sequence (SEQ ID NO:118) was derived from
the following regions of homology (SEQ ID NOS:119-132):

The amino acid sequence used for the design of dnaA forward primers was Gly
Leu Gly Lys Thr His (SEQ ID NO:133). The Forward Primer A177Fa had the
sequence 5'-GGNYTNGGNAARACSCAT-3' (SEQ ID NO:134); the Forward Primer
A177Fb had the sequence 5'-GGNYTNGGNAARACSCAC-3' (SEQ ID NO:135); the
Forward Primer A177Fc had the sequence 5'-GGNYTNGGNAARACWCAT-3' (SEQ
ID NO:136); while the Forward Primer A177Fd had the sequence
5'-GGNYTNGGNAARACWCAC-3' (SEQ ID NO:137). Four primers were used to keep
degeneracy at or under 512-fold. They varied in codons 5 and 6.
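
The degeneracy of each primer is the product of the number of bases allowed at each position under the standard IUPAC nucleotide ambiguity codes. The Python sketch below computes it for the four forward primers; the label A177Fd for the fourth primer (SEQ ID NO:137) and the single fully degenerate sequence used for comparison are assumptions introduced here for illustration.

```python
from math import prod

# Standard IUPAC nucleotide ambiguity codes and the bases each one represents.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer: str) -> int:
    """Number of distinct sequences represented by a degenerate primer."""
    return prod(len(IUPAC[base]) for base in primer.upper())


forward_primers = {
    "A177Fa": "GGNYTNGGNAARACSCAT",
    "A177Fb": "GGNYTNGGNAARACSCAC",
    "A177Fc": "GGNYTNGGNAARACWCAT",
    "A177Fd": "GGNYTNGGNAARACWCAC",  # fourth primer; label assumed here
}

for name, sequence in forward_primers.items():
    print(f"{name}: {degeneracy(sequence)}-fold")

# A single primer covering all Thr (ACN) and His (CAY) codons, for comparison.
unsplit = "GGNYTNGGNAARACNCAY"
print(f"unsplit primer: {degeneracy(unsplit)}-fold")
```

Splitting the degenerate positions in codons 5 and 6 across four primers keeps each at 512-fold, compared with 2048-fold for a single primer covering the same codons.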






For the design of the reverse primers, the consensus sequence (SEQ ID
NO:138) was derived from the following regions of homology (SEQ ID NOS:139-154):

Consensus sequence for dnaA reverse primers: E L/F F H T F N

The amino acid sequence for dnaA reverse primers was Glu Leu/Phe Phe His
Thr Phe Asn (SEQ ID NO:155). The reverse PCR primers for dnaA were reverse
primer A251Ra [5'-TTRAANGTRTGRAANAAYTC-3' (SEQ ID NO:156)] and reverse
primer A251Rb [5'-TTRAANGTRTGRAANAGYTC-3' (SEQ ID NO:157)] (SEQ ID
NO:67).

B. PCR Amplification and Cloning of T. thermophilus dnaA Probe

PCR amplification of dnaA was carried out using the Boehringer Mannheim
Expand™ long template PCR system as described above except that the following
conditions were used: the annealing steps were conducted at 55°C, elongation at 68°C
and the melting step at 94°C; 26 total cycles were run. PCR-amplified dnaA probe
was separated on 2% FMC Metaphor agarose gels, visualized by SYBR Green I staining,
cloned into vector pCRII (Invitrogen), and sequenced as described above. The primer
pair that yielded product of the expected length (237 bp) was A177Fb and A251Rb.

A roughly 237 bp and a 210 bp band were excised from the gel and cloned into
pCRII (Invitrogen) as previously described for dnaX and dnaE. Five colonies from
each of the 2 ligations were tested for the presence of insert by digestion with EcoRI,
which releases cloned inserts from the pCRII vector. All five of the dnaA clones
from the 237 bp DNA ligation had inserts. Four out of 5 clones using the 210 bp
DNA had inserts. Two colonies from the 237 bp insert clones were chosen at random
(clones A237.A and A237.E). These clones were grown up for plasmid
purification using the Qiagen procedure (described previously). Plasmid pA237.A
(DMSO dnaA L-A) was sent for sequencing (Fort Collins) and shown by sequence to
have homology to dnaA.

The results are shown in Figure 15. Figure 15A shows the deduced nucleotide
sequence of a portion of T. thermophilus dnaA (SEQ ID NO:65). In this Figure, the
sequences corresponding to primers are underlined (SEQ ID NOS:66 and 67). The
DNA sequence in Figure 15A was found to be equivalent to the message. A BLAST
search of the sequence between the regions corresponding to the primers was
conducted. These results revealed a strong homology to the structural genes encoding
bacterial DnaA origin binding protein. The corresponding segment of B. subtilis dnaA
showed 58% identity over a 68 amino acid stretch. The corresponding E. coli dnaA
showed 45% identity over 68 residues. As with other Figures, the intervening
designations indicate identical residues. Figure 15B shows the alignment results for T.
thermophilus, E. coli, and B. subtilis. In Figure 15B, the numbers refer to B. subtilis
(SEQ ID NOS:72 and 76) and E. coli (SEQ ID NOS:68 and 73) amino acid residues,
as appropriate; the numbers for T. thermophilus (SEQ ID NOS:70 and 75) refer to the
first base of the anticodon for the shown amino acid residue to the left of the
corresponding primer sequence near the 3' end of the shown sequence. In Figure 15,
homologous sequences are indicated (SEQ ID NOS:69, 71, and 74).

C. Identification of the T. thermophilus dnaA Gene

Probe A237 was also used to screen a lambda library using the methods
described above for dnaE (i.e., probe isolated and labeled as for the other
oligonucleotides). Over 100 candidate positive clones were identified; two were grown
up and the DNA purified as described for dnaE. The first clone (probe1-cl#3.1) was
sequenced using a primer designed from the sequence of plasmid pA237.A (See,
Figure 15A). It was discovered that the clone terminated and vector sequence was
encountered before the carboxyl terminus of dnaA was reached. The second clone
(probe1-cl#8.1) was sequenced to the end of the gene. Together with merged
sequence obtained from the initial PCR probe (Figure 15A), a sequence of 1229 bp
(SEQ ID NO:221) was obtained (See, Figure 19A). The 5' portion of this sequence
translated to an open reading frame, with the exception of stop codons at amino acid
positions 174 and 208 (SEQ ID NO:222) (Figure 19B). Performance of a BLAST
search substituting "X" (i.e., indicating an unknown amino acid) for these positions
indicated a high level of homology with eubacterial dnaA replication origin binding
proteins. A gap in the alignment was identified between positions 162 and 219. This
information, coupled with the stop codons at positions 174 and 208, indicates an error
in reading frame for this region. Nevertheless, this information was sufficient for
unambiguous identification of the gene.

Following the Tth dnaA gene, at least 10 copies of the DnaA protein binding
site (TTAT(C/A)CACA, and reverse complement; shown in bold and underlined in
Figure 19A), appear to be followed on the 3' end by AT-rich sequences, a feature
common to eubacterial replication origins. Thus, this region probably represents the
Tth chromosomal replication origin. Within the AT-rich segment were two DraI
restriction sites, sites that are expected to be rare in GC-rich organisms such as
thermophiles. These sites were exploited in a Southern blot, to determine the structure
of the downstream sequences.

Lambda DNA carrying the Tth dnaA gene and downstream sequences
(probe1-cl#8.1) was digested with DraI and a second restriction enzyme and subjected to
Southern analysis using an oligonucleotide selected from sequences downstream of the
second DraI site. A DraI-EcoRI digest yielded an approximately 2.2 kb fragment,
indicating that at least 2.2 kb of Tth DNA remained downstream before the lambda
cloning vector polylinker region. Digestion with DraI and BamHI yielded an
approximately 900 bp fragment. This information was used for obtaining the Tth
dnaN gene as described in Example 9.

EXAMPLE 7

Cloning and Sequencing of the T. thermophilus dnaQ Gene

A. Design of PCR Primers

Primers for dnaQ were designed from sequences conserved in the epsilon
subunits of bacteria and phage.

TABLE 3. Primers for dnaQ

Organism           Forward Sequence                    Reverse Sequence
E. coli            IVLDTETTGMNQI (SEQ ID NO:158)       LVIHNAA-FDIGFM (SEQ ID NO:159)
H. haemolyticus    IVLDTETTGMNQI (SEQ ID NO:160)       LVIHNAP-FDIGFM (SEQ ID NO:161)
B. aphidicola      IVLDTETTGMNISV (SEQ ID NO:162)      LVIHNAS-FDVGFI (SEQ ID NO:163)
B. subtilis        VVFDVETTGLSAV (SEQ ID NO:164)       LVIHNAA-FDMG (SEQ ID NO:165)
M. genitalium      VIFDIETTGLHGR (SEQ ID NO:166)       MVAHNGINFDLPFL (SEQ ID NO:167)
M. pulmonis        VVYDIETTGLSPM (SEQ ID NO:168)       MVAHNAA-FDHNFL (SEQ ID NO:169)
S. aureus          VVFDVETTGLSNQ (SEQ ID NO:170)       FVAHNAS-FDMGFI (SEQ ID NO:171)

Two primers were designed for each of the forward and reverse sequences for
dnaQ in order to reduce degeneracy. The amino acid sequence used for the design of
forward primers was Asp Thr/Ile/Val Glu Thr Thr Gly (SEQ ID NO:172). The first
forward primer (Q12Fa) had the sequence 5'-GAYACNGARACNACNGG-3' (SEQ ID
NO:173), while the second forward primer (Q12Fb) had the sequence
5'-GAYRTNGARACNACNGG-3' (SEQ ID NO:174). Both forward primers were
equivalent except for the second codon, which was split between the two primers in
order to keep the degeneracy below 512-fold: Thr was found in the gram negative
bacteria (primer Q12Fa), while Ile or Val were found in some gram positive bacteria
(primer Q12Fb). The amino acid sequence for the reverse primers was His Asn Ala
Ala/Ser Phe Asp (SEQ ID NO:175). The first reverse primer (Q98Ra) had the sequence
5'-TCRAANGCNGCRTTRTG-3' (SEQ ID NO:176), while the second reverse primer
(Q98Rb) had the sequence 5'-TCRAANSWNGCRTTRTG-3' (SEQ ID NO:177). Both
the reverse primers were equivalent except for the fourth codon (from the 3' end)
which encoded Ala (primer Q98Ra) and Ser (primer Q98Rb).
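
Because the reverse primers anneal to the message strand, taking their reverse complement recovers the degenerate coding sequence they were designed from. The sketch below does this using the standard IUPAC complement rules; the function names are illustrative, and the primer sequences are taken as written without internal spaces.

```python
# Complements of the IUPAC nucleotide codes (S and W are self-complementary).
COMPLEMENT = {
    "A": "T", "T": "A", "G": "C", "C": "G",
    "R": "Y", "Y": "R", "S": "S", "W": "W", "K": "M", "M": "K",
    "B": "V", "V": "B", "D": "H", "H": "D", "N": "N",
}


def reverse_complement(sequence: str) -> str:
    """Reverse complement of a (possibly degenerate) DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(sequence.upper()))


def split_codons(sequence: str) -> list:
    """Split a coding-strand sequence into triplets (the last may be partial)."""
    return [sequence[i:i + 3] for i in range(0, len(sequence), 3)]


reverse_primers = {
    "Q98Ra": "TCRAANGCNGCRTTRTG",
    "Q98Rb": "TCRAANSWNGCRTTRTG",
}

for name, primer in reverse_primers.items():
    coding = reverse_complement(primer)
    print(f"{name}: {' '.join(split_codons(coding))}")
```

The coding-strand triplets read CAY AAY GCN GCN TTY GA for Q98Ra, matching His Asn Ala Ala Phe and the first two bases of Asp. In Q98Rb the fourth triplet becomes WSN, a degenerate codon that covers the serine codons TCN and AGY.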






B. PCR Amplification of T. thermophilus dnaQ Probe

PCR amplification, gel analysis and sequencing of a dnaQ probe were carried
out as described above for the dnaA probe. Only the primer combination Q12Fa and
Q98Ra gave a PCR product of the expected size. The Q12Fa/Q98Ra primer
combination gave a single intense sharp band of approximately 270 bp and 8
additional bands of high molecular weight (6 to 14 kb).

An approximately 264 bp band resulting from amplification of T. thermophilus
chromosomal DNA preparation B using primers Q12Fa and Q98Ra was excised from a
2% Metaphor agarose gel (1x TAE) and cloned as described for the PCR probes for dnaE
and dnaX. Five colonies from the clones were chosen and plasmid isolated using the
Promega Plus Minipreps. The plasmid DNA from the five colonies was tested for
the presence of insert by digestion with restriction endonuclease EcoRI, which releases
cloned inserts from the pCRII vector. Four out of five showed the presence of insert.
Two clones, named plasmid pMGC/QFA11A and QFA11E, were chosen for further
analysis. Plasmid pMGC/QFA11A (DMSO 1329) was sent for sequencing
(Fort Collins) and was shown to have sequence homology to dnaQ (Figure 16).

The results are shown in Figure 16A (SEQ ID NO:77). In this Figure,
sequences corresponding to primers are underlined (SEQ ID NOS:78 and 79); the
sequence shown in this Figure is the complement of the message strand. A BLAST
search of the sequence between the regions corresponding to the primers revealed
strong homology to the structural genes encoding bacterial proofreading exonucleases.
The exonuclease domain of B. subtilis DNA polymerase III (SEQ ID NO:84) showed
40% identity over a 50 amino acid stretch. The epsilon proofreading subunit (ε) of
the E. coli DNA polymerase III holoenzyme (dnaQ) (SEQ ID NO:80) showed 32%
identity over 49 amino acid residues. The amino acid sequences are shown in Figure
16B. In Figure 16B, for B. subtilis and E. coli, the numbers refer to amino acid
residues, while for T. thermophilus (SEQ ID NO:82), the numbers refer to the first
base of the anticodon for the shown amino acid residue to the left of the corresponding
primer sequence near the 3' end of the shown sequence. As with other Figures, the
intervening designations indicate identical residues (SEQ ID NOS:81 and 83).

Southern blots were also conducted as described above, using the above probe and
digests of T. thermophilus DNA preparation B. The following fragments were detected:

HindIII            7.55 kb
HindIII/EheI       2.25 kb
EheI               3.4 kb
ApaLI              Very high molecular weight
ApaLI/HindIII      1.1 kb
ApaLI/EheI         2.2 kb

Plasmid pMGC/QFA11A (DMSO 1329) was also used as a probe to screen a
lambda library using the same techniques described for dnaE. However, experiments
using this probe failed to produce positive colonies, unlike the corresponding
experiments for dnaE and dnaA. Next, an oligonucleotide probe was designed, based on
the sequence shown in Figure 16. With this probe (5'-CCT CGA ACA CCT CCT GCC
GCA AGA CCC TTC GAC CCA-3'; SEQ ID NO:209), over 100 strong positive plaques
were identified and verified by replating. Three were grown up and the DNA purified
as described for dnaE.

One (probe3-cl#5.1.1) was selected for further sequencing. The sequence (Fig.
18A) of a major portion of the gene was obtained by direct sequencing of the insert in
the isolated lambda DNA using sequences selected from the PCR product (Fig. 18A;
SEQ ID NO:214) to initiate sequencing. Upon preliminary examination of the
sequence, it was found to encode one continuous open reading frame (Figure 18B;
SEQ ID NO:215) that showed significant homology to DNA polymerase III ε
subunits from other bacteria (based on a BLAST search). However, alignment of the
open reading frame with known protein sequences revealed no homology to the first
thirty five (35) amino acids. No candidate ATG start sites were present before regions
of homology were encountered. However, if the GTG found as the 36th codon was
the initiating codon, a protein would be expressed (Figure 18C; SEQ ID NO:216) that
gives good alignment, with approximately the same distance from the initiating codon
and regions of strong sequence conservation as other eubacterial ε subunits (Figure
18D). Interestingly, the carboxyl-terminus aligns with the carboxyl-terminal residue of
the most closely homologous dnaQ genes from Treponema pallidum (SEQ ID NO:218)
and Aquifex aeolicus (SEQ ID NO:219), potentially indicating that all or almost all of
the relevant sequence has been identified. The alignment also includes homologous
sequence from E. coli (SEQ ID NO:220). A strong secondary structure or other block
prevented obtaining more 5' sequence of the Tth dnaQ gene.

To overcome this problem, a minimal fragment is subcloned into a plasmid and
the plasmid sequenced, coming in from both the 5' and the 3' ends of the unknown
sequence. To identify a suitable fragment for subcloning, purified lambda DNA
carrying the gene is digested with SbfI, to cleave the third codon of the gene. This
digested DNA is then divided into a series of aliquots and a second digestion is then
performed with one additional restriction enzyme (AvrII, ClaI, KpnI, NcoI, NheI, RsaI,
SphI, SplI, SpeI, or AccI). A control tube is mock-digested without a second enzyme.
These digested DNAs are then subjected to SDS-PAGE and a Southern blot is
performed, using a synthetic oligonucleotide selected from the known Tth dnaQ coding
sequence. Useful candidates for subcloning and sequencing are selected from those
enzymes that yield products of between 700 and 2000 bp. Selected fragments are
subcloned and sequenced, until the 5' end of the Tth dnaQ gene is determined.

Upon determination of the entire sequence of the Tth dnaQ gene, a vector is
constructed that expresses the candidate Tth ε subunit fused by its amino-terminus to
the biotin/hexahis tag (i.e., as used in conjunction with the α subunit of Tth DNA
polymerase III, as described in Example 5). This is accomplished by replacing the
polylinker of pA1-NB-Avr-2 with a synthetic oligonucleotide containing, minimally, an
AvrII sticky end, followed by an SbfI site, to reconstruct the second and third codons
of Tth dnaQ, followed by a spacer and a site for an enzyme that cuts downstream of
dnaQ, as determined by Southern blotting (above). This site is followed by the spacer
and the FseI site and the sequence that follows FseI found in the starting vector. The
cloned Tth dnaQ gene is cleaved with SbfI and the selected downstream enzyme, and
then cloned into the corresponding sites of the expression vector.

The Tth ε-fusion protein is expressed and purified by lysis, ammonium sulfate
fractionation, and Ni-NTA chromatography using methods described in Ex
and 8. In the situation where the ε fusion protein is insoluble, it is solubilized in urea
(6 M or higher), and chromatographed in the denatured state in the presence of urea on
the Ni-NTA column. A second chromatographic step using soft-release avidin (See,
Kim and McHenry, J. Biol. Chem., supra) may follow. The purified fusion protein is
used as an antigen to obtain polyclonal or monoclonal antibodies that are used to
purify the wild-type Tth ε subunit from Tth cells by immunoprecipitation or column
immunoaffinity procedures, as known in the art. The isolated protein is further
purified. Then, the amino-terminus and denatured molecular weight are determined by
procedures described in other Examples, to confirm that the correct protein is
expressed.

In the case that the amino terminus is generated from the proposed GTG start
site, a vector is constructed that expresses wild-type ε by amplifying Tth dnaQ in
modified form with PCR primers that replace the initiating GTG with ATG, and
precede the sequence with sequence that corresponds to a ClaI site. The 3' end PCR
primer reproduces the translational termination site, followed by either BamHI, XhoI,
or XbaI, depending upon which site is absent from the Tth dnaQ open reading frame.
The PCR fragment is cleaved with restriction enzymes that recognize the terminal
noncomplementary sites used for PCR, and cloned into the corresponding sites of
pAl-CB-Cla-1 (See, Example 8).

In the case that the analysis of the wild-type ε indicates more extensive
sequence on the amino-terminal end, a Southern analysis is performed on the DNA
digest described above using an oligonucleotide probe selected from the 5' side of the
SbfI site. Hybridizing fragments of increasing size are subcloned until a sequence is
revealed that matches the true experimentally determined terminus of the Tth dnaQ
gene. This wild-type Tth ε is then expressed, using modifications of the procedures
described above.

In either case, standard lysis procedures as described in Example 5, and
optimized ammonium sulfate fractionation procedures as described in Examples 5 and
8, will be used, with an antibody directed against a portion of Tth epsilon, in order to
maximize purification, with retention of at least half of the Tth epsilon protein. Tth
epsilon is purified by cation, anion, hydrophobic, and gel filtration chromatographic
procedures until homogeneous. The purification is monitored by quantitative
immunoblotting procedures and 3' to 5' exonuclease assays (See, Griep et al.,
Biochem., 29:9006-9014 [1990]), to ensure selection of procedures that provide good
yields of active epsilon, and for selection of the most pure fractions, to provide
material for additional purification steps. Native DnaQ (Tth epsilon) is used for the
reconstitution of DNA polymerase III holoenzyme, and developed as an additive to
improve the fidelity of thermophilic polymerases that do not contain proofreading
exonucleases, and as an additive to remove unincorporated bases, permitting long PCR
reactions to be performed with a variety of thermophilic polymerases.

The amino-terminal ε fusion proteins are also used to prepare affinity columns
for the isolation of novel proteins that bind to DnaQ (ε subunit) alone, as well as
DnaQ present in a complex with the isolated DnaE protein. The structural genes for
novel proteins found by this method are isolated, sequenced, expressed, purified, and
used to determine whether they make contributions to the functional activity of the T.
thermophilus DNA polymerase III holoenzyme.
EXAMPLE 8

Construction of Vectors Expressing Native T. thermophilus τ and γ Subunits

As the sequence of the T. thermophilus dnaX gene is complete (See, Figure
9A), vectors were constructed that overproduce both the τ and γ subunits of the T.
thermophilus DNA polymerase III holoenzyme in E. coli. Methods used to construct
these vectors are similar to those followed previously for the corresponding E. coli
subunits to overproduce native τ and γ (See e.g., Dallmann et al., J. Biol. Chem.,
270:29555-29562 [1995]).

1. Construction of the Starting Vectors

First, a vector, pDRK-C (See, Kim and McHenry, J. Biol. Chem., 271:20690-
20698 [1996]), containing a pBR322 origin of replication, a gene expressing the lacIQ
repressor protein, and a semisynthetic E. coli promoter (pAl) that is repressed by the
lacI repressor, was modified. Plasmid pDRKC DNA was prepared and digested with
BamHI; the resulting 3' ends were filled in to the end of the corresponding template
strand with the Klenow fragment of DNA pol I in the presence of Mg++ and the four
dNTPs (dATP, dGTP, dTTP, and dCTP), and resealed with T4 DNA ligase, in the
presence of 1 mM ATP. Plasmids were transformed into E. coli, plasmid-containing
colonies were selected by ampicillin resistance, and the plasmids were prepared and
screened for loss of the BamHI site. One of the colonies that contained plasmid that
had not been cleaved by BamHI was selected, grown, and used for preparation of the
resulting plasmid pDRKC-Bam-minus.

pDRKC-Bam-minus was prepared and digested with XbaI and DraIII to remove a
small polylinker (this removed polylinker contains XbaI-NcoI-NotI-DraIII sites). The
following oligonucleotides were cloned into the digested plasmid:

5'-CTAGGAGGTTTTAATCGATGCGGCCGGATCCTCGAGTCTAGACACTGG
3'-    CTCCAAAATTAGCTACGCCGGCCTAGGAGCTCAGATCTGTG (SEQ ID
NO:178; ATG# P38-A1).
The conversion of the BamHI site to GGATCGATCC, and the replacement of
the original polylinker with SEQ ID NO:178, was confirmed by DNA sequencing,
using methods known in the art. Creation of the filled-in BamHI site was found to
have created a ClaI site, but it is not cleaved if plasmid is purified from methylase-
proficient E. coli strains.
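
The fill-in result stated above can be checked with a short sketch. The helper name fill_in_and_religate is hypothetical and is not part of the original procedure; it simply models cutting a palindromic site that leaves a 5' overhang, filling in both ends, and blunt-ligating them. The note about Dam (GATC) sites is an assumption offered as a likely explanation for the methylation sensitivity reported in the text.

```python
def fill_in_and_religate(site: str, cut_offset: int) -> str:
    """Top-strand sequence after cutting a palindromic site that leaves a
    5' overhang, filling in both ends, and blunt-ligating them together."""
    left = site[:cut_offset]                             # "G" for G^GATCC
    overhang = site[cut_offset:len(site) - cut_offset]   # "GATC"
    right = site[len(site) - cut_offset:]                # "C"
    # Each filled-in end carries the overhang as duplex DNA, so the
    # overhang sequence appears twice across the new junction.
    return left + overhang + overhang + right


BAMHI = "GGATCC"   # cut as G^GATCC
CLAI = "ATCGAT"

junction = fill_in_and_religate(BAMHI, cut_offset=1)

# Positions of GATC (Dam methylation) motifs in the junction; assumption:
# overlap with these motifs is what blocks ClaI in methylase-proficient hosts.
dam_sites = [i for i in range(len(junction) - 3) if junction[i:i + 4] == "GATC"]

print(junction)                 # filled-in junction sequence
print(CLAI in junction)         # a ClaI site is created
print(BAMHI in junction)        # the BamHI site is lost
print(dam_sites)
```

The junction reproduces the GGATCGATCC sequence reported above, with the new ClaI site flanked by two GATC motifs.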

The resulting plasmid pAl-CB-Cla-1 (previously referred to as pAl-CB-
EBXXDS) contained the restriction sites within a polylinker to enable the following
cloning steps. The following is a reproduction of the above oligonucleotides with the
relevant sequences annotated:

5'-CTAGGA GGTTTTA ATCGAT GCGGCC GGATCC TCGAG TCTAGA CACTGG
CTCCAAAATTAGCTACGCCGGCCTAGGAGCTCAGATCTGTG-5' (SEQ ID
NO:179). In this sequence, the following annotations apply:

CTAG = Sticky end for XbaI, but destroys site, so it is not recleaved
AGGAGG = rbs
ATCGAT = ClaI site
ATG = initiation codon
CGGCCG = EagI site
GGATCC = BamHI site
CTCGAG = XhoI site
TCTAGA = XbaI site
CACTGG = 3'-overhang to regenerate DraIII site
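
As a sanity check on the annotated polylinker, the following minimal sketch confirms that the two strands given above are complementary over the paired region and locates each annotated feature on the top strand. The variable and function names are illustrative only; the overhang lengths (4 nt at the 5' end, 3 nt at the 3' end) are read off the sequences as printed.

```python
TOP = "CTAGGAGGTTTTAATCGATGCGGCCGGATCCTCGAGTCTAGACACTGG"   # 5'->3'
BOTTOM_3TO5 = "CTCCAAAATTAGCTACGCCGGCCTAGGAGCTCAGATCTGTG"   # written 3'->5'

FEATURES = {
    "rbs": "AGGAGG",
    "ClaI": "ATCGAT",
    "ATG": "ATG",
    "EagI": "CGGCCG",
    "BamHI": "GGATCC",
    "XhoI": "CTCGAG",
    "XbaI": "TCTAGA",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(seq: str) -> str:
    """Base-by-base complement (no reversal)."""
    return seq.translate(_COMPLEMENT)


FIVE_PRIME_OVERHANG = 4    # CTAG, compatible with an XbaI end
THREE_PRIME_OVERHANG = 3   # TGG, left unpaired for the DraIII end

paired_region = TOP[FIVE_PRIME_OVERHANG:len(TOP) - THREE_PRIME_OVERHANG]
duplex_ok = complement(paired_region) == BOTTOM_3TO5

# 0-based position of the first occurrence of each feature on the top strand
feature_positions = {name: TOP.find(seq) for name, seq in FEATURES.items()}

print(duplex_ok)
print(feature_positions)
```

The ClaI site and the ATG overlap by two bases (ATCGATG), which is what allows a coding sequence to be joined at the initiation codon through the ClaI site.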

In parallel, a T7 promoter cloning vector was developed, so that it could be
determined which promoter (i.e., T7 or pAl) provided the best levels of soluble
protein in a form amenable to further purification. The starting vector, pET11-KC
(Kim and McHenry, J. Biol. Chem., 271:20690-20698 [1996]), contains a pBR322
replication origin, a copy of lacI and a T7 promoter. A ClaI site in the vector was
destroyed and the polylinker was replaced with a synthetic one to enable further
cloning steps. pET11-KC was prepared and cut with ClaI, then filled in and religated
to destroy the site. The resulting plasmids were transformed into E. coli, and
individual colonies were screened for plasmid that had lost the ability to be cleaved
by ClaI. One colony was selected and used to prepare plasmid DNA. The resulting
DNA, pET11-KC-Cla-minus, was then cut with XbaI and DraIII, and the same duplex
replacement oligonucleotide described above was cloned into it, resulting in plasmid
pET-CB-Cla-1 (previously referred to as pET-CB-CEBXXDS). The sequence of
pET-CB-Cla-1 was confirmed by DNA sequencing, using methods known in the art.

2. Construction of Plasmids that Overexpress T. thermophilus
τ and γ From the pAl Promoter

Plasmid "pAX2.S" (also known as "pUNCOl") DNA was prepared from the
stock strain DMSO 1386. A 465 bp segment corresponding to the amino-terminal end
of T. thermophilus DnaX was PCR-amplified using primers P38-S1587
(TATCGATGAGCGCCCTCTACCG; SEQ ID NO:210) and P38-A2050
(CGGTGGTGGCGAAGACGAAGAG; SEQ ID NO:211). The forward primer (P38-
S1587) added a ClaI site overlapping the initiation codon and changed the initiator
TGT to ATG at the 5'-end of dnaX. Addition of 5% DMSO was required for
efficient amplification of the GC-rich template. The resulting PCR product was cloned
into the pGEM-T Easy vector (Promega) using the pGEM-T Easy vector kits. This
vector is supplied with a T-overhang for direct cloning of PCR fragments containing a
nontemplated A-overhang from Taq polymerase. A clone was selected that carried a
plasmid with an approximately 319 bp insert that could be removed by digestion with
ClaI and BamHI. The sequence of the DNA was verified through the region to be
subcloned, and the approximately 319 bp ClaI/BamHI fragment from the PCR clone
was subcloned into the corresponding sites of pAl-CB-Cla-1, resulting in the plasmid
designated as "pAl-5'GX."
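
The primer design described above can be illustrated with a brief sketch that checks two properties stated in the text: the ClaI site in the forward primer overlaps the ATG, and both primers are GC-rich (consistent with the reported need for DMSO). The function and variable names are illustrative and not part of the original method.

```python
FORWARD = "TATCGATGAGCGCCCTCTACCG"   # P38-S1587 (SEQ ID NO:210)
REVERSE = "CGGTGGTGGCGAAGACGAAGAG"   # P38-A2050 (SEQ ID NO:211)
CLAI = "ATCGAT"


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)


cla_start = FORWARD.find(CLAI)    # 0-based start of the ClaI site
atg_start = FORWARD.find("ATG")   # 0-based start of the initiation codon

# The ATG begins inside the six-base ClaI recognition sequence.
site_overlaps_atg = cla_start <= atg_start < cla_start + len(CLAI)

print(cla_start, atg_start, site_overlaps_atg)
print(round(gc_fraction(FORWARD), 2), round(gc_fraction(REVERSE), 2))
```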

The C-terminal-coding portion of T. thermophilus dnaX was removed from
pAX2.S by cleavage with BamHI and XbaI, and the resulting 1558 bp fragment was
cloned into the corresponding sites of pAl-5'GX, generating plasmid "pAl-TX" that
should express T. thermophilus τ and γ. Plasmid was transformed into E. coli, and
colonies were screened for the approximately 1559 bp fragment, as well as integrity of
the SpeI and AflII sites. One clone, "pAl-CB-TX," was selected. The DNA sequence
across the 5'-cloning sites was verified by DNA sequencing.

3. Construction of pAl Promoter-Containing Plasmids that
Overexpress T. thermophilus τ Fused to a Carboxyl-Terminal
Peptide That Contains Hexahistidine and a Biotinylation Site

The present invention also provides methods and compositions for expression of
T. thermophilus τ fused on its carboxyl-terminus to tagged peptides. This permits
rapid purification and oriented immobilization to create an affinity column for isolation
of additional T. thermophilus proteins that bind τ. During the development of the
present invention, it was determined that E. coli τ tolerates fusion of foreign proteins
to its C-terminus with preservation of activity. These observations were utilized to
produce the fused T. thermophilus τ. In particular, the present invention provides
methods and compositions for expression of T. thermophilus τ fused on its carboxyl-
terminus to tagged peptides containing hexahistidine and a site that is biotinylated in
vivo by the E. coli biotinylation enzyme. This permits rapid purification and oriented
immobilization to create an affinity column for isolation of additional T. thermophilus
proteins that bind τ.

In these experiments, the vector pDRK-C described above encodes a 30-residue
peptide that is brought into frame with the C-terminus of dnaX. This was
accomplished by engineering a PCR product to contain a properly phased SpeI site in
place of the normal termination codon of T. thermophilus dnaX. PCR was then
conducted using the cloned T. thermophilus dnaX gene as a template. One PCR
primer was internal to the dnaX AflII site, while the second primer contained a
cleavable SpeI site that is preceded by the final 8 codons of T. thermophilus dnaX
(excluding the stop codon). The PCR product was then cleaved with SpeI and AflII,
and inserted into the corresponding sites of the vector to generate plasmid "pAl-CB-
Tth-dnaX" (also known as "pAl-CB-TX").

Specifically, a PCR reaction was conducted using primers P38-S2513 (forward
primer; 5'-CCGCCATGACCGCCCTGGAC; SEQ ID NO:212) and P38-A3183
(reverse primer; 5'-ACTAGTTATACCAGTACCGCCTAT; SEQ ID NO:213). It was
necessary to add DMSO to 5% final concentration to obtain amplification of the
desired product from the GC-rich template. The PCR fragment was cloned into
pGEM-T Easy, generating the plasmid "pT-geneX3'." Plasmids were transformed into
E. coli, a plasmid-containing colony was selected, and the DNA sequence of the cloned
PCR product was confirmed. The approximately 610 bp AflII-SpeI fragment was
removed and cloned into the corresponding sites of pAl-TX, replacing the 3'-region of
T. thermophilus dnaX and bringing the C-terminus of dnaX into frame with the
desired fusion peptide. The plasmid was then transformed into E. coli, and colonies
were screened for those that contained an approximately 610 bp AflII-SpeI fragment
and an approximately 1600 bp ClaI-SpeI fragment. The resulting plasmid was
designated "pAl-CB-TX."

4. Placing the Sequences Expressing T. thermophilus τ and τ-
Biotin/Hexahistidine Polypeptide Under Control of the T7 Promoter

Next, the ClaI-SpeI fragment containing the entire coding region of the T.
thermophilus dnaX gene was removed from the two pAl promoter driven expression
vectors, pAl-TX and pAl-CB-TX, and placed into the corresponding sites of the T7
promoter driven expression vector pET-CB-Cla-1 to generate plasmids pET-TX and
pET-CB-TX, which express the wild type T. thermophilus DnaX proteins and DnaX
protein fused to the described hexahistidine-biotinylation peptide, respectively.
Construction of pET-TX

An approximately 1.9 Kb ClaI-SpeI fragment was removed from pAl-TX and
cloned into pET-CB-Cla-1 that had been cleaved with ClaI and SpeI. Plasmid was
transformed into E. coli, and individual colonies were screened for those that contained
plasmid carrying a 1.9 Kb ClaI-SpeI fragment and the expected SpeI, ClaI and BamHI
sites. A colony was selected and the plasmid it contained was named "pET-TX." This
vector expresses the wild type T. thermophilus dnaX gene under the control of a T7
promoter.

Construction of pET-CB-TX

An approximately 1.6 Kb ClaI-SpeI fragment was removed from pAl-CB-TX
and cloned into pET-CB-Cla-1 that had been cleaved with ClaI and SpeI. Plasmid was
transformed into E. coli, and individual colonies were screened for those that contained
plasmid carrying a 1.6 Kb ClaI-SpeI fragment. A colony was selected and the plasmid
it contained was named "pET-CB-TX." This vector expresses T. thermophilus dnaX
fused on its carboxyl-terminus to the described hexahistidine-biotinylated peptide under
the control of a T7 promoter.

5. Comparing Overproduction of Soluble T. thermophilus dnaX
Protein From pAl-TX, pAl-CB-TX, pET-TX and pET-CB-TX

Next, DNA from the plasmids pAl-CB-TX and pET-CB-TX was prepared and
transformed into strain MGC-1030 (mcrA, mcrB, lambda-, lexA3, uvrD::Tc,
OmpT::Kn) and BL21 (DE3) (F-, ompT, hsdS, gal, lysogen of lambda DE3 [carries
gene for T7 RNA polymerase under the control of the lac uv5 promoter]; available
from Enzyco) to generate strains suitable for overproduction of T. thermophilus τ-
fusion protein in E. coli.

To permit comparison of the levels of expression, these strains were first grown
as overnight cultures in 2x YT medium (16 g tryptone, 10 g yeast extract, and 5 g
NaCl per liter) with and without ampicillin. Control strains (MGC-1030 and
BL21(DE3)) without plasmids and an overproducer of the E. coli τ fused at its C-
terminus to the described hexahistidine/biotinylated peptide were also grown. After
overnight incubation, the cultures were inoculated at a 1:50 dilution into 3 ml of like
media and were allowed to grow until the cell density (measured as OD600) reached
approximately 0.6. The cells were induced with 1 mM IPTG, followed by addition of
biotin to 10 μM. The cells were allowed to grow for 3 h post-induction before
harvesting. The final OD of the cells at harvest was between 1.0 and 1.5.

The cells were lysed in reducing sample buffer, sonicated and heated to 100°C
for 5 min, and the cell debris was removed by centrifugation immediately prior to
loading on an SDS-PAGE gel. Cells (pellet resulting from 2.5 ml cell culture) were
suspended in 2x Novex SB (sample buffer; #LC2678) (170-400 μl depending on OD
of harvested cells, with 70 μl sample buffer/OD600/ml), boiled for 5 min and
immediately loaded onto a 4-20% gradient SDS-PAGE gel, and run at 135 constant
volts at room temperature in Tris/glycine buffer. For the Coomassie-stained gels,
0.017 OD600 units were loaded/lane; for the biotin-blot gels, 0.006 OD600 units were
loaded/lane. Proteins from the resolved gels were transferred to membranes and
detected with streptavidin-alkaline phosphatase as described in Kim and McHenry
(Kim and McHenry, J. Biol. Chem., 271:20690 [1996]). By visual inspection, the
blots showed that vector pAl-CB-TX expressed approximately 3-fold more T.
thermophilus DnaX fusion protein than pET-CB-TX, but expressed approximately 20-
fold less than the control E. coli τ fusion protein. The density of the pAl-CB-TX
expressed τ fusion protein was approximately 7-fold less than the levels of the
endogenous E. coli biotin-carrier protein. Accordingly, pAl-CB-TX was used for
further work where a fusion protein was required.
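
The sample-normalization rule quoted above (70 μl sample buffer per OD600 unit per ml of culture) can be expressed as a small calculation. This is only a sketch of the arithmetic; the function names are illustrative, and the load volumes are those of undiluted lysate under the assumption that the rule is applied exactly as written.

```python
UL_BUFFER_PER_OD_ML = 70.0   # ul sample buffer per (OD600 unit x ml culture)


def buffer_volume_ul(od600: float, culture_ml: float = 2.5) -> float:
    """Sample buffer needed to resuspend the pellet from `culture_ml` of culture."""
    return UL_BUFFER_PER_OD_ML * od600 * culture_ml


def lysate_volume_for_load_ul(od_units: float) -> float:
    """Volume of lysate that contains `od_units` OD600 units of cells."""
    return od_units * UL_BUFFER_PER_OD_ML


# Harvest ODs of 1.0 and 1.5 bracket the range reported in the text.
for od in (1.0, 1.5):
    print(od, buffer_volume_ul(od))

print(round(lysate_volume_for_load_ul(0.017), 2))   # Coomassie lanes
print(round(lysate_volume_for_load_ul(0.006), 2))   # biotin-blot lanes
```

Normalizing to OD600 in this way keeps the amount of cell material per lane constant, so band intensities can be compared between strains by eye.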
6. Purification of T. thermophilus τ Subunit Fusion with
a Biotinylated Peptide Containing Hexahistidine

In these experiments, all purification procedures (unless otherwise specified) are
generally conducted at 4°C. However, in cases where the protein dissociates or is
judged to have lost activity in this or any subsequent purifications, room temperature
may be used. Cells are grown and lysed as described, with modifications made in the
growth conditions to yield optimal levels of soluble undegraded protein as determined
in section 5 above. DnaX protein contained in the cleared lysate is precipitated using
ammonium sulfate added to 60% saturation, or higher, as necessary to precipitate all of
the DnaX protein. The T. thermophilus τ protein fused to a C-terminal biotinylated
hexahistidine-containing peptide is purified by ammonium sulfate fractionation and
Ni++-NTA ion-chelating chromatography, much as described above.
If necessary, additional purification can be achieved by affinity chromatography on
monomeric avidin affinity columns as known in the art.

The purified protein is used to generate a battery of monoclonal antibodies that
react with it by the procedures described in Example 2, above. Antibody producing
cell lines are selected that express antibody that reacts strongly with T. thermophilus τ
fusion protein as shown in ELISA assays and Western blots, as known in the art and
described in previous Examples. The latter assay system is used to distinguish
antibodies that react with contaminants present in the T. thermophilus τ fusion protein
preparation. As a control, the E. coli a subunit that has the same fusion peptide is 
included in the screen, in order to eliminate antibodies that are directed against the 
fusion peptide. Selected hybridomas are grown up at the 3 liter level to produce an 
abundant quantity of antibody. 


7. Large Scale Production of Cells Expressing T. thermophilus
τ-Hexahistidine-Biotin Fusion Protein

Strain MGC1030 (pAl-CB-TX) was grown in a 250 L fermentor to produce
cells for purification of biotin peptide-tagged Tth τ. F-media (yeast extract (1.4%),
tryptone (0.8%), K2HPO4 (1.2%), and KH2PO4 (0.12%)) was sterilized, and ampicillin
(100 mg/L) and kanamycin (35 mg/L) were added. A large scale inoculum was initiated
from 1 mL of DMSO stock (to 28 L) and grown overnight at 37°C. The inoculum
was transferred (3.5 L) to the 250 L fermentor (starting OD600 = 0.046) containing F-
media + 1% glucose (sterilized separately and added after sterilization of F-media) and
ampicillin at 100 mg/L. Temperature was set at 37°C, aeration was set at 40 LPM,
and agitation was set at 200 rpm. A pre-induction sample of 1 liter was collected
through the sample valve just prior to induction, stored at 4°C, and spun down into a
pellet the next day. Expression of biotin-tagged Tth dnaX was induced when the
culture reached OD600 = 0.794 with IPTG at 1 mM. Additional ampicillin (100 mg/L)
and biotin (10 μM) were added at the time of induction with 1 mM IPTG. Additional
ampicillin (100 mg/L) was added 1 hour post induction. Harvest was initiated 3 hours
post induction; cells were chilled to 14°C during harvest. The harvest volume was
170 L and final harvest weight was 2.22 g of cell paste. An equal amount (w/w) of
50 mM Tris (pH 7.5)/10% sucrose solution was added to the cell paste. Quality control
results showed 8/10 colonies on ampicillin-containing plates before induction and 7/10
colonies post induction. Cells were frozen by pouring into liquid nitrogen and stored
at -80°C until further processed.

8. Purification of T. thermophilus
τ-Hexahistidine-Biotin Fusion Protein

First, 1650 ml of Tris-sucrose prewarmed to 45°C were added to a 1200 g cell
suspension (600 g cells) of MGC1030 (pAl-CB-TX) cells in Tris-sucrose. Then, 30
ml of 0.5 M DTT was added, followed by 150 ml lysis solution (0.3 M spermidine-HCl
(pH 7.5), 10% sucrose, 2 M NaCl), and 30 ml 0.5 M EDTA. The pH was adjusted to
pH 8.2 by the addition of 25 ml 2 M Tris base. Lysozyme (600 mg) was dissolved in
25 ml Tris-sucrose. After 5 minutes of mixing by stirring at 4°C, the slurry was
poured into GSA bottles, and placed on ice for 1 hour. The bottles were then swirled
in a 37°C water bath for 4 min., gently inverting every 30 seconds. The bottles were
then centrifuged at 12,000 rpm in a Sorvall GSA rotor for 1 hour at 0°C. The
supernatant (Fr I; 2250 ml) contained approximately 34.7 g protein. Then, 0.390 g
ammonium sulfate was added for each ml of Fr I, to achieve 60% saturation at 4°C.
The pellet was collected by centrifugation at 12,000 rpm for 30 min. at 0°C in a
Sorvall GSA rotor. The pellet (Fr II; 25.9 g protein) was redissolved in 770 ml of
Buffer N, and dialyzed overnight versus Buffer N (50 mM sodium phosphate (pH 7.8),
500 mM NaCl, 10% glycerol, 0.5 mM DTT, 0.1 mM PMSF, 1 mM imidazole), and
was then applied to a 15 ml Ni-NTA agarose column (Qiagen). The flow-through was
collected and reapplied to the column. The column was then washed with 20 column
volumes of buffer containing 50 mM sodium phosphate (pH 7.8), 500 mM NaCl, 10%
glycerol, 0.5 mM DTT, and 10 mM imidazole. Then, a 12 column volume gradient
was run, with an increase in imidazole concentration from 10 to 300 mM.

Biotin blots performed on the fractions indicated that the majority of the tagged
Tth DnaX protein eluted with the 10 mM imidazole-containing wash, demonstrating a
weaker affinity for the column than was expected. Nevertheless, the weak binding
permitted separation from the majority of protein that flowed through; minimally, a
10-fold purification over the applied fraction was achieved. The wash fraction (Fr III,
350 mg protein) was precipitated by addition of an equal volume of saturated
ammonium sulfate, and two equal-volume pellets were collected by
centrifugation.
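
The ammonium sulfate steps described above involve simple arithmetic that can be sketched as follows. The function names are illustrative. The mixing calculation assumes that volumes are additive, which is an approximation, and it treats the saturated stock as exactly 100% saturated.

```python
def solid_ammonium_sulfate_g(volume_ml: float, g_per_ml: float) -> float:
    """Grams of solid ammonium sulfate to add for a given per-ml dose."""
    return volume_ml * g_per_ml


def saturation_after_mixing(sample_ml: float, saturated_ml: float,
                            initial_saturation: float = 0.0) -> float:
    """Approximate fractional saturation after adding saturated solution.

    Assumes additive volumes and a stock at exactly 100% saturation.
    """
    total_ml = sample_ml + saturated_ml
    return (initial_saturation * sample_ml + 1.0 * saturated_ml) / total_ml


# Fr I: 2250 ml at 0.390 g/ml (the dose given in the text for 60% saturation)
print(solid_ammonium_sulfate_g(2250, 0.390))

# Fr III: an equal volume of saturated ammonium sulfate (volume is arbitrary here)
print(saturation_after_mixing(100.0, 100.0))
```

Under these assumptions, the equal-volume addition used for Fr III corresponds to roughly half saturation, somewhat lower than the 60% used for Fr I.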

One of the two Fr III pellets was redissolved in 30 ml of dialysis buffer, and
dialyzed overnight versus dialysis buffer (50 mM Tris-HCl (pH 7.5), 10% glycerol,
and 5 mM DTT). The dialysate was applied to a 30 ml SP Sepharose column
equilibrated in the same buffer as used for the dialysis step. The column was washed
with one column volume, and proteins were eluted with a 10 column volume gradient
ranging from 0 to 400 mM NaCl in dialysis buffer. The peak of eluted Tth τ was
estimated based on biotin blot results. Fractions 26-36 (52 ml) from the 56 fractions
collected from the gradient were pooled, resulting in approximately 1560 μg total
protein (Fr. IV).
Fr. IV was concentrated using two methods. One-half of the fraction was
precipitated by addition of ammonium sulfate to 80% saturation. The precipitate was
recovered by centrifugation at 13,000 rpm for 2 hours in an HB4 swinging bucket
rotor (Sorvall) at 4°C, resulting in recovery of 460 μg protein. The other half of Fr
IV was concentrated using an Amicon concentrator with a YM10 (25 mm) membrane,
with stirring at 4°C, resulting in recovery of 560 μg protein. Biotin blots indicated
that a better specific recovery of biotinylated Tth τ (DnaX protein) was achieved using
the Amicon membrane than by ammonium sulfate precipitation.
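
For comparison of the two concentration methods, the total-protein recoveries implied by the figures above can be computed as shown in this sketch. It assumes that each method received exactly half of the 1560 μg in Fr. IV; note that it reflects total protein only, whereas the text judges specific recovery of the biotinylated protein by biotin blot.

```python
FR_IV_TOTAL_UG = 1560.0
INPUT_PER_METHOD_UG = FR_IV_TOTAL_UG / 2   # assumption: an exact half to each method

recovered_ug = {
    "ammonium sulfate (80% saturation)": 460.0,
    "Amicon YM10 membrane": 560.0,
}


def percent_recovery(recovered: float, applied: float) -> float:
    """Recovered protein as a percentage of the protein applied."""
    return 100.0 * recovered / applied


recoveries = {
    method: round(percent_recovery(ug, INPUT_PER_METHOD_UG), 1)
    for method, ug in recovered_ug.items()
}

for method, pct in recoveries.items():
    print(f"{method}: {pct}% of total protein recovered")
```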

An SDS-PAGE gel was run, the proteins were transferred to a nitrocellulose
membrane, and biotin blots were prepared. Comparison of the biotin blotted band
with protein standards indicated that the protein had migrated 3 mm farther than
carboxyl-terminal tagged E. coli τ protein (approximately 51,000 Da), indicating that
the protein had a molecular weight of approximately 69,000 Da, which is consistent,
within experimental error and the variability of the technique with variances in
primary sequence, with the expected molecular weight of approximately 61,000 Da.

9. Improvement in Expression Level of Tth τ and
C-Terminal Biotin/Hexahis Tagged τ

The low level of Tth τ expression was thought likely to be due to the
abundance of GC rich codons, especially near the initiating ATG. To minimize this
effect, the initial portion of the gene is resynthesized, replacing the GC rich codons
with degenerate codons that are common in E. coli, yet encode the same amino acid.

Specifically, plasmids pAl-TX and pAl-CB-TX are cleaved with ClaI (i.e.,
which cleaves just before the initiating ATG) and with PmlI, and the intervening
sequence is replaced with synthesized and annealed oligonucleotides containing
commonly used E. coli codons. The sequences of these oligonucleotides are:
CGATGTCTGCTCTGTACCGTCGTTTTCGTCCGCTGACTTTCCAGGAAGTAGTA
GGTCAGGAACAC (SEQ ID NO:235)

and

TACAGACGAGACATGGCAGCAAAAGCAGGCGACTGAAAGGTCCTTCATCATC
CAGTCCTTGTG (SEQ ID NO:236)
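
The design of the replacement oligonucleotides can be examined with the following sketch, which translates the top strand, checks that the second oligonucleotide (written 3' to 5') is its complement, and compares GC content over the first five codons. The "native" start used for comparison is an assumption: it is taken from the forward primer P38-S1587 (SEQ ID NO:210) given earlier in this Example, not from the full gene sequence. All names are illustrative.

```python
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, built in TCAG order for each codon position
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(seq: str) -> str:
    """Translate complete codons of a DNA sequence, reading from position 0."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


def gc_fraction(seq: str) -> float:
    return (seq.count("G") + seq.count("C")) / len(seq)


SYNTHETIC_TOP = ("CGATGTCTGCTCTGTACCGTCGTTTTCGTCCGCTGACTTTCCAGGAAGTAGTA"
                 "GGTCAGGAACAC")                      # SEQ ID NO:235, 5'->3'
SYNTHETIC_BOTTOM_3TO5 = ("TACAGACGAGACATGGCAGCAAAAGCAGGCGACTGAAAGGTCCTTCATCATC"
                         "CAGTCCTTGTG")               # SEQ ID NO:236, as written
NATIVE_START = "ATGAGCGCCCTCTAC"   # assumption: first 5 codons per primer P38-S1587

synthetic_orf = SYNTHETIC_TOP[2:]   # skip the 5'-CG ClaI sticky end
strands_pair = synthetic_orf.translate(str.maketrans("ACGT", "TGCA")) == SYNTHETIC_BOTTOM_3TO5

print(translate(synthetic_orf))
print(translate(NATIVE_START))
print(strands_pair)
print(round(gc_fraction(synthetic_orf[:15]), 2), round(gc_fraction(NATIVE_START), 2))
```

In this comparison the synthetic sequence encodes the same first five residues as the primer-derived native start while using fewer G and C bases, which is the stated aim of the redesign.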

The resulting biotin peptide-tagged expression vectors are grown up, induced,
analyzed by biotin blotting methods, and compared with the expression obtained with
the existing biotin-tagged Tth τ expressing vector, and the best expressing vector is
chosen for further experimentation.

10. Purification of T. thermophilus τ Subunit with a
Biotinylated Peptide-Containing Fusion

The optimally expressed C-terminal biotin-tagged Tth τ protein is purified using
modifications of the methods described for partial purification (i.e., methods in the
present Example). The ammonium sulfate precipitation methods are optimized, using
the methods described for tagged Tth α in Example 5. The resulting ammonium
sulfate pellet is then subjected to Ni-NTA chromatography, using methods similar to
those described in Example 5 for Tth α, with the exception being that the column is
washed with 5 column volumes of buffer containing 1 mM imidazole prior to
beginning the gradient. Once the results of the Ni-NTA chromatography are analyzed,
the washing procedure is altered, so that as high a concentration of imidazole as
possible can be used without elution of tagged Tth τ; the gradient is also optimized,
so that Tth biotin-tagged τ elutes approximately one half way through the gradient.

The purified protein is used to generate a battery of monoclonal
antibodies that react with it by the procedures described in Example 2, above.
Antibody producing cell lines are selected that express antibody that reacts strongly
with T. thermophilus τ fusion protein as shown in ELISA assays and Western blots, as
known in the art and described in previous Examples. The latter assay system is used
to distinguish antibodies that react with contaminants present in the T. thermophilus τ
fusion protein preparation. As a control, the E. coli a subunit that has the same 
fusion peptide is included in the screen, in order to eliminate antibodies that are 
directed against the fusion peptide. Selected hybridomas are grown up at the 3 liter 
level to produce an abundant quantity of antibody. 

11. Purification of Natural T. thermophilus τ and γ Subunits

Natural τ and γ subunits expressed from the modified T. thermophilus dnaX
gene in E. coli are purified by column chromatographic procedures and assayed by
SDS-PAGE, with confirmation of the identity of the authentic τ protein by Western
blots using the above monoclonal antibodies. Cell growth and lysis are conducted as
described above (See e.g., Dallmann et al., supra) with modifications in growth
conditions to yield optimal levels of soluble undegraded protein as determined in
section 5 above. The ammonium sulfate precipitation conditions are optimized in
order to maximize the amount of T. thermophilus DnaX protein precipitated and
minimize the total level of protein precipitated. Protein determinations are conducted
by the method of Bradford with modifications following the instructions that come
with the reagent supplied by Pierce. The level of T. thermophilus DnaX protein is
determined by quantitative Western blots using methods as described in previous
Examples (e.g., those used to monitor the levels of T. thermophilus DnaE protein),
with the exception being that antibodies to DnaX prepared as described above will be
used. While the optimum level of ammonium sulfate determined by experiment is
used, the following provides a representative procedure: Fr. I (cleared lysate) is
precipitated with 107 g of ammonium sulfate (0.226 g for each mL of Fr. I, 40%
saturation) and centrifuged at 22,000 x g for approximately 30 min. Pellets are
backwashed by resuspension in a Dounce homogenizer with 100 mL of Buffer TBP
containing 0.1 M NaCl and 0.2 g/mL ammonium sulfate (35% saturation), and re-
centrifuged. The final pellets are stored at 4°C until used and referred to as Fr. II.

The chromatography columns (e.g., hydrophobic, ion exchange, and/or sizing
columns) are chosen so as to yield maximally pure protein and provide the highest
yield. Portions of Fr II are dissolved in Buffer SP (50 mM Tris-HCl (pH 7.5), 10%
(w/v) glycerol, 5 mM dithiothreitol) (approximately 7 mg total protein/ml final),
clarified by centrifugation (28,000 x g, 30 min), and diluted with the same buffer to a
conductivity equivalent to 50 mM NaCl. In experiments where the desired DnaX
protein does not remain soluble, as judged by remaining in the supernatant after gentle
centrifugation, it is dissolved in larger quantities of buffer, adding additional salt and
diluting just before application to the column, and/or adding low levels of various
detergents (e.g., 0.02% NP-40).

This material is loaded onto a Q Sepharose (Pharmacia; 2 mg protein/ml resin)
column equilibrated in Buffer SP. After loading, the column is washed with one
column volume of Buffer SP, then developed with a 12 column volume gradient of 50
to 600 mM NaCl in Buffer SP at a flow rate of 1 column volume/h, and 100 fractions
are collected. The elution positions of T. thermophilus γ and τ are determined by
quantitative Western blots, and total protein is determined by the method of Bradford.
Fractions with the highest ratio of DnaX protein to total protein are pooled. Generally,
this results in pooling fractions of ½ peak height or greater. In the situation where the
protein fails to bind to the column, buffers with lower salt (down to 0 NaCl, with
dialysis to decrease endogenous levels of salt) are used, and, if that fails, buffers with
higher pH are used. In the situation where the protein fails to elute, the ionic strength
of the gradient is increased until it does so. Once the elution position is determined,
the gradient is optimized, so that the total ionic strength change is not more than what
is needed (generally no more than a 400 mM NaCl change) and the DnaX protein
elutes ½ way through the gradient. The pooled peak is then precipitated by the
addition of sufficient ammonium sulfate to precipitate all protein (generally 60%
saturation) and the pellet collected by centrifugation (28,000 x g for 30 min) to yield
Fr. III.
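The gradient and pooling rules described above can be sketched numerically. The Python below is an illustration only: it assumes equal-volume fractions spanning a strictly linear gradient, it reads the pooling criterion as half of the peak height, and the function names and the example signal values are hypothetical rather than measured data.

```python
def gradient_concentration(fraction_index, n_fractions=100,
                           start_mm=50.0, end_mm=600.0):
    """NaCl concentration (mM) at the midpoint of one gradient fraction.

    Assumes equal-volume fractions covering the whole linear gradient;
    fraction_index is 1-based.
    """
    if not 1 <= fraction_index <= n_fractions:
        raise ValueError("fraction index out of range")
    midpoint = (fraction_index - 0.5) / n_fractions
    return start_mm + midpoint * (end_mm - start_mm)


def pool_half_peak(signal):
    """Return 1-based indices of fractions at or above half the peak signal."""
    peak = max(signal)
    return [i + 1 for i, value in enumerate(signal) if value >= 0.5 * peak]


if __name__ == "__main__":
    # Hypothetical DnaX signal per fraction (arbitrary units).
    western_signal = [0, 1, 4, 9, 10, 6, 3, 1]
    print(pool_half_peak(western_signal))
    print(gradient_concentration(50))
```

With 100 fractions over 12 column volumes, each fraction corresponds to 0.12 column volume, so the salt concentration at which DnaX elutes can be read from the fraction number.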

The Fr. III pellet is then dissolved in Buffer SP to a concentration of
approximately 2 mg/ml, and centrifuged (28,000 x g, 30 min) to clarify. Fr. III is then
loaded onto an SP Sepharose (Pharmacia; approximately 5 mg protein/ml resin)
column equilibrated with Buffer SP. After loading, the column is washed with one
column volume of Buffer SP, and developed with a 12 column volume gradient of 50
to 600 mM NaCl in Buffer SP at a flow rate of 1 column volume/h; approximately
100 fractions are collected. Chromatography is quantitated and optimized as described
for the Q-Sepharose column. Additionally, this column resolves τ and γ, and
conditions are optimized to enable this resolution. In the situation where τ and γ are
not separated by this procedure or the preceding Q Sepharose procedure, hydrophobic
chromatography using commercially available resins is conducted. The resulting
pooled fractions of γ and τ are precipitated with ammonium sulfate as described for
Q-Sepharose, resulting in Fr. IV.

Fr. IV is dissolved in Buffer H (25 mM Hepes-KOH (pH 7.5), 25 mM NaCl,
5% glycerol, 0.1 mM EDTA) (approximately 7 mg protein/ml), and applied to an S-400
HR (Pharmacia) column (44:1 height:diameter ratio, total column volume 50-times
sample volume) equilibrated with Buffer H. The column is developed in the same
buffer at a flow rate of 1 column volume/day, and 100 fractions are collected. DnaX
proteins (τ and γ are run on separate columns, since they were resolved in a preceding
column) are quantitated as described for the Q-Sepharose column. These fractions are
pooled and distributed in aliquots, which are then flash-frozen in liquid nitrogen and
stored at -80°C as Fr. IV. This material provides reagents for reconstituted T.
thermophilus DNA polymerase III holoenzyme.
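The sizing column geometry stated above (bed volume 50 times the sample volume, 44:1 height:diameter ratio) fixes the bed dimensions for any given sample. A minimal Python sketch of that calculation follows; it assumes an ideal cylindrical bed, and the function name and the 2 mL example sample are hypothetical.

```python
import math


def column_dimensions(sample_volume_ml, volume_factor=50.0, aspect_ratio=44.0):
    """Return (bed_volume_ml, diameter_cm, height_cm) for a cylindrical bed.

    Uses V = (pi / 4) * d**2 * h with h = aspect_ratio * d, and 1 mL = 1 cm^3.
    """
    if sample_volume_ml <= 0:
        raise ValueError("sample volume must be positive")
    bed_volume = sample_volume_ml * volume_factor
    diameter = (4.0 * bed_volume / (math.pi * aspect_ratio)) ** (1.0 / 3.0)
    height = aspect_ratio * diameter
    return bed_volume, diameter, height


if __name__ == "__main__":
    volume, diameter, height = column_dimensions(2.0)  # hypothetical 2 mL sample
    print(f"bed {volume:.0f} mL, {diameter:.2f} cm x {height:.1f} cm")
    # At 1 column volume/day the flow rate is bed volume / 24 h.
    print(f"flow {volume / 24.0:.2f} mL/h")
```

In practice the nearest commercially available column would be chosen; the calculation only indicates the scale implied by the stated ratios.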
EXAMPLE 9
Isolation of the T. thermophilus DnaN Protein

The sequence downstream from the T. thermophilus dnaA gene is determined
and, if homologous to dnaN genes of other eubacteria, used to express the T.
thermophilus β subunit. The dnaN gene resides downstream of the dnaA gene in
most, but not all, eubacteria.

A large scale lambda vector preparation was made, and the DNA was extracted,
purified, and digested with DraI and EcoRI to obtain dnaN sequence. The resulting
2.2 kb fragment was separated from other fragments by electrophoresis, and then
eluted in water for direct sequence analysis. Primers for sequence analysis were
selected from the available sequence downstream of the second DraI site (See, Figure
19A). The resulting sequence (Figure 20A) is preliminary and likely to contain errors,
as it has not yet been confirmed by sequencing in the second direction. Nonetheless,
the sequence information was sufficient for the identification of the encoded gene.

Upon subjecting the DNA sequence to ORF finder (NIH), an open reading
frame of 137 amino acids initiating at the underlined ATG in Figure 20A was
obtained. This amino acid sequence is shown in Figure 20B. Using the BLAST
program linked to ORF finder, the open reading frame was compared with all known
and deduced protein sequences. Nine eubacterial beta polymerase sequences were
identified as the best matches. The best match (with a Salmonella
sequence that is nearly identical to the E. coli sequence) yielded an E value of 0.007,
indicating an approximately 670-fold higher probability of a match than the first
non-beta match (E=4.7), which was not even from a eubacterium. The Tth amino acid
beta sequence (SEQ ID NO:232) was aligned with E. coli (SEQ ID NO:233) and S.
pneumoniae (SEQ ID NO:234) β sequences. This alignment is shown in Figure 20C.
As indicated, the best alignment was obtained with the
amino-terminus of both (E. coli and S. pneumoniae) β sequences. Matches with
sequence were approximately equally good between all three divergent sequences,
indicating that Tth is approximately as close to both E. coli and S. pneumoniae betas
as they are to one another. Thus, there is strong evidence that the sequence identified
in these experiments is the amino terminus of Tth DnaN.
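The approximately 670-fold figure quoted above is the ratio of the two reported E values. The short Python check below reproduces that arithmetic; the function name is hypothetical. Note that an E value is an expected number of chance hits in the database searched, so the ratio is a relative measure of significance rather than a formal probability.

```python
def e_value_ratio(best_e, next_e):
    """Fold difference between a weaker (larger) and stronger (smaller) E value."""
    if best_e <= 0 or next_e <= 0:
        raise ValueError("E values must be positive")
    return next_e / best_e


if __name__ == "__main__":
    # Best beta match (E = 0.007) versus first non-beta match (E = 4.7).
    print(round(e_value_ratio(0.007, 4.7)))
```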

Full-length sequence of Tth dnaN is obtained by sequencing the isolated DraI-EcoRI
fragment, as well as the isolated 900 bp DraI-BamHI fragment described above,
if needed, until full and accurate sequence is obtained. The DraI-EcoRI and
DraI-BamHI fragments or subfragments of these fragments are used for cloning and to
enable sequencing from both sides, if difficulties in sequencing arise (e.g., due to
secondary structure barriers).

Upon obtaining full-length sequence, PCR is used to construct a vector. In this
process, a primer containing a non-complementary ClaI site immediately preceding the
region of complementarity starting at the initiating ATG is used. A 3' terminal primer
containing an SpeI site (non-complementary) immediately after the complementary
portion of the oligonucleotide that will end at the stop codon found after the last codon
for Tth dnaN is also used. Upon obtaining the terminal sequences with certainty, this
procedure is conducted (i.e., it is not contemplated that the internal sequences will be
necessary in these steps). The resulting fragment is cleaved with ClaI and SpeI, and
cloned into the corresponding sites of vector pAl-CB-Cla-1 (See, Example 8), and the
sequence is verified by DNA sequencing as known in the art. A C-terminal
biotin/hexahis tagged Tth β expressing vector is constructed by appropriate
modifications of the procedures described for making C-terminal tagged Tth τ in
Example 8.

It is contemplated that in the situation where an internal SpeI site is identified
within the Tth dnaN gene, an alternative approach is used. In this alternative
approach, an XbaI or DraIII site also present in the vector is substituted. In
another alternative, where a ClaI site is present within Tth dnaN, precluding its use for
cloning the initiating ATG in front of the vector ribosome binding site, vectors
presently available that have the ClaI site substituted with NdeI, SwaI or NsiI are used.
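The primer strategy and the internal-site check described in the preceding paragraphs can be sketched as follows. This Python example is illustrative only: the two-base "GC" clamp, the annealing length, the function names and the toy open reading frame are assumptions, and the toy sequence is not Tth dnaN. The ClaI (ATCGAT) and SpeI (ACTAGT) recognition sequences are the standard ones.

```python
CLAI = "ATCGAT"  # ClaI recognition sequence
SPEI = "ACTAGT"  # SpeI recognition sequence
STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def internal_sites(orf, site):
    """0-based positions at which a recognition sequence occurs in the ORF."""
    orf = orf.upper()
    width = len(site)
    return [i for i in range(len(orf) - width + 1) if orf[i:i + width] == site]


def design_primers(orf, anneal_len=20, clamp="GC"):
    """Return (forward, reverse) primers carrying non-complementary sites.

    Forward: clamp + ClaI site + ORF start (from the initiating ATG).
    Reverse: clamp + SpeI site + reverse complement of the ORF end,
    so that the amplified fragment ends at the stop codon.
    """
    orf = orf.upper()
    if not orf.startswith("ATG"):
        raise ValueError("ORF must begin with the initiating ATG")
    if orf[-3:] not in STOP_CODONS:
        raise ValueError("ORF must end with a stop codon")
    forward = clamp + CLAI + orf[:anneal_len]
    reverse = clamp + SPEI + reverse_complement(orf[-anneal_len:])
    return forward, reverse


if __name__ == "__main__":
    toy_orf = "ATGGCTAAAGGTCTGCGTGAATAA"  # hypothetical sequence
    for name, site in (("ClaI", CLAI), ("SpeI", SPEI)):
        if internal_sites(toy_orf, site):
            print(f"internal {name} site found; use an alternative site")
    print(design_primers(toy_orf, anneal_len=9))
```

If either check reports an internal site, the alternative sites named in the text would be substituted for the corresponding constant.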
Expressed C-terminal tagged β is purified first, using ammonium sulfate
precipitation optimization and Ni++-NTA agarose chromatography, and if necessary,
soft-release avidin chromatography as described for Tth DnaE (α) in Example 5.
Polyclonal and/or monoclonal antibodies directed against β (i.e., as the antigen) are
prepared as described in Examples 5 and 8. These antibodies are then used to
optimize purification of wild-type Tth β.

An optimized Tth β ammonium sulfate pellet (optimized and obtained as
described in Examples 5 and 8) is further purified by chromatography on
Q-Sepharose, S or SP Sepharose, hydrophobic chromatography, and gel filtration
chromatography as described in Examples 5-8. Ionic strength and pH of column
buffers are optimized in order to permit binding of the β protein to the column resins,
as well as elution near the middle of the gradient, which produces good yields.

The purified β is then used in conjunction with Tth DNA polymerase III (α or
an α and ε complex) to improve the processivity of DNA polymerase III in PCR or
other applications involving DNA synthesis. In this simple assay format, β levels will
be in 100-fold stoichiometric excess of DNA polymerase III, and reactions are conducted
using linear templates. In yet other applications, β is combined with DNA polymerase
III, Tth DnaX complex and SSB on primed templates in the presence of ATP (1
mM) and 10 mM MgCl2. This enables efficient assembly of the polymerase on DNA
with β. Elongation proceeds with the addition of dNTPs to 1 mM concentrations.
Reactions are conducted at the optimal temperature of the reconstituted reaction
(approximately 60°C) or higher. β allows Tth DNA polymerase to replicate denatured
and single-stranded DNA molecules of up to approximately 1 megabase. In these
reactions, the DNA template is varied from 2.5 to 250 fmoles. Other reaction
components and solution conditions used are those described for the gap-filling assay
in Example 5, with the exception being that the pH may be varied to the optimal point
(e.g., between pH 6 and 9), and potassium glutamate may be added to an optimal level
(e.g., 50-200 mM). Primers are varied from a 1:1 ratio relative to template, to a
10,000-fold excess over template, depending upon the level of amplification desired.
Other components are also included, as described in other Examples.
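The stoichiometric ranges given above can be turned into reaction amounts with simple arithmetic. The following Python sketch is illustrative only: the function name, the dictionary keys and the example polymerase amount are hypothetical (the text does not state a polymerase quantity), and the text does not specify whether the 100-fold β excess is counted as monomer or dimer.

```python
def reaction_amounts_fmol(template_fmol, primer_ratio, pol_fmol, beta_excess=100.0):
    """Return component amounts (fmol) for one reaction.

    template_fmol: 2.5 to 250 fmol, per the text.
    primer_ratio:  primer:template ratio, 1 to 10,000, per the text.
    pol_fmol:      DNA polymerase III amount (an assumed input).
    beta_excess:   fold excess of beta over polymerase (100-fold in the text).
    """
    if not 2.5 <= template_fmol <= 250.0:
        raise ValueError("template outside the 2.5-250 fmol range")
    if not 1.0 <= primer_ratio <= 10000.0:
        raise ValueError("primer ratio outside the 1:1 to 10,000:1 range")
    if pol_fmol <= 0:
        raise ValueError("polymerase amount must be positive")
    return {
        "template": template_fmol,
        "primer": template_fmol * primer_ratio,
        "pol_iii": pol_fmol,
        "beta": pol_fmol * beta_excess,
    }


if __name__ == "__main__":
    # Hypothetical reaction: 25 fmol template, 100-fold primer, 10 fmol polymerase.
    print(reaction_amounts_fmol(25.0, 100.0, 10.0))
```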

EXAMPLE 10
Isolation of T. thermophilus SSB
and Expression and Purification of SSB

T. thermophilus SSB is isolated by chromatography on DNA cellulose columns
using modifications of the methods of Molineux et al. (Molineux et al., J. Biol.
Chem., 249:6090-6098 [1974]). Lysates (Fraction I) are prepared by the methods
described in Example 2. Fraction II is prepared by addition of 0.24 g of ammonium
sulfate to each ml of Fraction I, and precipitates are collected as described in
Example 2. The collected ammonium sulfate precipitate is dissolved in 20 mM Tris-HCl
(pH 7.5), 50 mM NaCl, 10% glycerol, 5 mM β-mercaptoethanol at a concentration of
1 mg protein/ml and applied to a denatured DNA cellulose column equilibrated in the
same buffer. A 1.5 x 13 cm column is run for each 100 g of T. thermophilus used as
starting material. The column is then washed successively with two column volumes of
equilibration buffer at 0.5 column volume/h containing 100, 200, 400, 800, 1600 and
2000 mM NaCl, respectively. Fractions eluting from the column are monitored by
SDS-PAGE, and the tightest binding fractions that contain proteins between 15,000 Da
and 30,000 Da are pooled individually for each protein and subjected to further
purification by Q-Sepharose (Pharmacia) chromatography. The pooled fractions are
dialyzed against 20 mM Tris-HCl (pH 7.5), 10% glycerol, 5 mM β-mercaptoethanol,
and then applied to a Q Sepharose column equilibrated in the same buffer at a load
ratio of 2 mg protein/ml resin. The column is eluted with a 20 column volume gradient
from 0-1 M NaCl in the equilibration buffer, and the fractions containing the sought
protein selected by SDS-PAGE in the preceding step are pooled. This procedure is
repeated for each major candidate.

Each of the pooled candidate fractions is concentrated by ammonium sulfate
precipitation (60% saturation) and collected as described under Example 2. The
protein is concentrated as much as possible (e.g., consistent with its being soluble after
centrifugation in a bench top microcentrifuge for 2 minutes, as judged by protein
determination with the Bradford reagent as described in the above Examples), dialyzed
against the Q-Sepharose equilibration buffer, and then subjected to SDS-PAGE and
blotted onto a membrane.

Both N-terminal and internal peptide sequencing are conducted as described
above for DnaX and DnaE. For the protein that contains sequences that show the best
homology to SSB from other eubacteria (i.e., as judged by the most favorable score
using the NIH's BLAST server), oligonucleotides are designed for obtaining a
fragment of the T. thermophilus ssb gene using methods described for dnaX, E, Q and
A. Success in obtaining a fragment of the sought ssb gene is judged by a favorable
score of a sequence obtained by DNA sequencing the cloned PCR products (i.e., better
than 1 x 10^-2) in a BLAST search of the sequences between primers against the nr
database at NIH's server. This fragment is used to probe a lambda library for a full
length clone as described for DnaE.

The candidate full length fragment is subjected to DNA sequencing and its
identity confirmed by recognizable homology to other eubacterial ssbs using the
default parameters in the NIH BLAST server. The full length gene is modified and
overexpressed using the strategies outlined for DnaX above. The isolated
homogeneous protein (e.g., either from purification from an overproducing strain or
from non-overproducing T. thermophilus) is then used to support reconstituted T.
thermophilus replication systems.

EXAMPLE 11
Use of T. thermophilus DnaX Proteins to Obtain Additional Components
of the DnaX Complex and Reconstitution of T. thermophilus DnaX Complex

In other cellular systems examined to date, the ATPase that transfers the sliding
clamp processivity factor (i.e., β2 in E. coli, PCNA in eukaryotes) contains five
different proteins that are tightly and cooperatively bound in a complex. In this
Example, T. thermophilus DnaX protein is immobilized on a column and cleared T.
thermophilus lysates are passed over the column, in order to permit subunit exchange
and assembly of a full DnaX complex on the immobilized protein. Contaminants are
washed away, and the specifically bound proteins are eluted, separated by SDS-PAGE,
transferred to a membrane, and both amino-terminal and internal peptide sequences
determined, using methods known in the art. These sequences are used to isolate the
structural genes for the isolated proteins. These proteins are expressed and purified,
their ability to form a specific complex with T. thermophilus DnaX is confirmed, and
they are used to reconstitute T. thermophilus DnaX complex to provide a functional T.
thermophilus DNA polymerase III holoenzyme.

Alternatively, if the Tth complex is so stable that significant subunit exchange
does not occur to permit success with the procedure described in the preceding
paragraph, the biotin/hexahis vector is modified by moving the modified Tth dnaX
gene from pAl-CB-TX to a Tth expression vector, transfecting the vector into T.
thermophilus, and expressing the tagged protein in T. thermophilus. It is contemplated
that this will allow assembly of the tagged Tth DnaX protein with the other
components of the Tth DnaX complex as they are synthesized.

The in vivo assembled complex is isolated by Ni++-NTA chromatography as
described herein. The associated proteins are then characterized, and used to isolate
the structural genes, as well as to construct expression systems as described above. If
the tagged DnaX protein synthesized in T. thermophilus is biotinylated, as revealed by
biotin blots as done for biotinylated Tth τ expressed in E. coli, soft release avidin,
avidin or streptavidin chromatography is used to isolate the complex. It is
contemplated that, if necessary, the Tth complex will be biotinylated by treatment with
the E. coli biotinylation enzyme.

If both of the above methods are determined to be unsatisfactory, the DnaX
accessory proteins are purified biochemically, monitoring their position of elution
during chromatographic procedures based upon their requirement to be added back to
purified DnaE, DnaQ, DnaX and DnaN proteins (α, ε, τ, γ and β, respectively), and,
if necessary, SSB, to reconstitute replication on a long single-stranded DNA template
containing a single oligonucleotide primer.

Also, an assay is set up that employs a long single-stranded template and a
single primer. This assay system is used to assay for an activity in T. thermophilus
extracts (or ammonium sulfate or column fractions, as appropriate) that stimulates
replication by limiting levels of DNA polymerase III (α and ε) in the presence of
DnaN (β) and, if necessary, SSB. This assay is used to guide purification of T.
thermophilus DnaX auxiliary factors. The isolated proteins are partially sequenced
(e.g., N-terminus and internal sequences), a fragment is isolated by PCR, the entire
full length gene is isolated from chromosomal libraries in lambda vectors as described
for DnaE, the corresponding structural genes are sequenced and expressed, and the
resulting proteins are purified by ammonium sulfate fractionation and chromatographic
procedures. Then, conditions for reconstitution of DNA polymerase III holoenzyme
activity on long single-stranded templates are optimized.

EXAMPLE 12 
Additional Assay Systems 
In this Example, methods that assay specifically for processive DNA replication
on long single-stranded templates are described. These assays are useful, as they
permit further optimization of the systems described above to maximize processive
synthesis. These steps include assaying cleared lysates, ammonium sulfate and
chromatographic fractions for factors that further increase the yield of long replication
products. Once identified, these factors are purified, their structural genes isolated and
expression vectors constructed to provide larger quantities of these stimulatory factors.

In addition, the processive assays are modified to include regions of high
secondary structure at a defined point in the template. In a second assay, long
single-stranded DNA that contains extensive complementarity to internal regions
ranging from 100-1000 bases is added. Conditions that increase readthrough of these
regions are determined, including by assaying for T. thermophilus protein factors that
enable this readthrough.

Furthermore, the initial optimization is conducted on lambda clones containing
inserts up to 23 kb. Primers are selected from insert sequence or flanking lambda
sequences. Optimum protocols to amplify DNA to provide optimal yields are
developed.

In other assays, templates with extensive single-stranded regions in
front of the primer are constructed. The duplex region is either blunt or contains
extensive 5'-noncomplementary tails. Among the factors that stimulate these assays, it
is contemplated that helicases and helicase-associated factors are required for the
assembly of the helicases onto DNA or for their maximal activity once associated.
Helicases that can specifically interact with the DNA polymerase III holoenzyme are
favored in this type of approach, since they provide the most striking stimulation of
assays using the DNA polymerase III holoenzyme as a polymerase.

It is also contemplated that DNA polymerase III holoenzyme-associating
helicases will be isolated using some of the protein affinity column approaches
described herein (i.e., DnaE or DnaX and associating components (or both)
immobilized on a streptavidin or avidin column).

It is contemplated that other routes toward stimulating helicases will be more
directed. For example, it is contemplated that searches be conducted for Tth homologs
of DnaB, uvrD and rep (i.e., helicases that have been found to be associated with DNA
polymerase III holoenzyme from E. coli and/or work in conjunction with it for the
processive replication of duplex DNA). These helicases are isolated by alignment with
homologs, selection of conserved regions and synthesis of degenerate PCR primers,
isolation of a portion of the gene by PCR, and isolation of the full length gene from
Tth-lambda libraries as described herein for dnaA and dnaQ.

Once these helicases are in hand, it is contemplated that they be added to
assays, to identify additional assembly factors required for their function. It is
also contemplated that this type of procedure will function in the isothermal
amplification of DNA without the requirement for thermal cycling. It is further
contemplated that it will be advantageous to add polymerases with more potent strand
displacement activity, even if their overall processivity and ability to function on long
templates is limited, to assist the polymerase through regions of difficult secondary
structure.

Once optimization has been performed on lambda isolates and on templates
containing duplex regions in front of the primer, the procedure is extended to 100 kb
or longer isolates using longer vectors. The optimal conditions are determined for
these vectors, and the methodology is extended to isolated chromosomal DNA to
determine the practical limitations of these methods.

Optimal elongation protocols are developed, and cycling protocols that enable
reconstituted T. thermophilus DNA polymerase III holoenzyme plus isolated accessory
factors to be used in PCR methods using substrates (i.e., targets) beyond 10 kb, 50 kb,
and 200 kb, and larger, are also established. Conditions that permit the component
proteins to remain stable during repeated denaturation steps to approximately 95°C, in
standard protocols as known in the art, are also established. In the alternative,
capillary technologies requiring very short denaturation times (i.e., a few seconds at
approximately 95°C) are developed. In addition, methods and conditions for
isothermal amplification, wherein the polymerase is coupled to the action of a
thermophilic helicase and associated assembly factors, are also obtained.

From the above, it is clear that the present invention provides novel
thermophilic DNA polymerases and preparations from T. thermophilus. In particular,
the present invention provides T. thermophilus polymerase III preparations and means
to identify polymerase IIIs present in other species.

All publications and patents mentioned in the above specification are herein
incorporated by reference. Various modifications and variations of the described
method and system of the invention will be apparent to those skilled in the art without
departing from the scope and spirit of the invention. Although the invention has been
described in connection with specific preferred embodiments, it should be understood
that the invention as claimed should not be unduly limited to such specific
embodiments. Indeed, various modifications of the described modes for carrying out
the invention which are obvious to those skilled in the relevant arts are intended to be
within the scope of the following claims.
1. A DNA polymerase III holoenzyme isolated from a thermophilic
organism.

2. The DNA polymerase III holoenzyme of Claim 1, wherein said thermophilic
organism is selected from a member of the genera Thermus, Thermotoga, and Aquifex.

3. The isolated nucleotide sequence set forth in SEQ ID NO:196.

4. The isolated nucleotide sequence of Claim 3, wherein said nucleotide
sequence further comprises 5' and 3' flanking regions.

5. A recombinant DNA vector comprising at least a portion of the
nucleotide of Claim 3.

6. A host cell containing the recombinant DNA vector of Claim 5.

7. At least a portion of purified DnaE protein encoded by an
oligonucleotide comprising at least a portion of nucleotide sequence substantially
homologous to the coding strand of the nucleotide sequence of Claim 3.

8. The purified protein of Claim 7, wherein said DnaE protein is from
Thermus thermophilus.

9. The purified protein of Claim 7, wherein said DnaE protein comprises
the amino acid sequence set forth in SEQ ID NO:197.

10. A fusion protein comprising at least a portion of the DnaE protein of
Claim 7, and a non-dnaE protein sequence.

11. The fusion protein of Claim 10, wherein said DnaE protein comprises
SEQ ID NO:197.

12. An isolated nucleotide sequence as set forth in SEQ ID NO:214.

13. The nucleotide sequence of Claim 12, wherein said nucleotide sequence
further comprises 5' and 3' flanking regions.

14. A recombinant DNA vector comprising at least a portion of the
nucleotide of Claim 12.

15. A host cell containing the recombinant DNA vector of Claim 14.

16. At least a portion of purified DnaQ protein encoded by an
oligonucleotide comprising at least a portion of nucleotide sequence substantially
homologous to the coding strand of the nucleotide sequence of Claim 12.

17. The purified protein of Claim 16, wherein said DnaQ protein is from
Thermus thermophilus.

18. The purified protein of Claim 16, wherein said DnaQ protein comprises
an amino acid sequence selected from the group consisting of SEQ ID NOS:215 and
216.

19. A fusion protein comprising at least a portion of the DnaQ protein of
Claim 16, and a non-dnaQ protein sequence.

20. The fusion protein of Claim 19, wherein said DnaQ protein comprises
an amino acid sequence selected from the group consisting of SEQ ID NO:215 and
SEQ ID NO:216.

21. The isolated nucleotide sequence set forth in SEQ ID NO:221.

22. The isolated nucleotide sequence of Claim 21, wherein said nucleotide
sequence further comprises 5' and 3' flanking regions.

23. A recombinant DNA vector comprising at least a portion of the
nucleotide of Claim 21.

24. A host cell containing the recombinant DNA vector of Claim 23.

25. At least a portion of purified DnaA protein encoded by an
oligonucleotide comprising at least a portion of nucleotide sequence substantially
homologous to the coding strand of the nucleotide sequence of Claim 21.

26. The purified protein of Claim 25, wherein said DnaA protein is from
Thermus thermophilus.

27. The purified protein of Claim 25, wherein said DnaA protein comprises
the amino acid sequence set forth in SEQ ID NO:222.

28. A fusion protein comprising at least a portion of the DnaA protein of
Claim 25, and a non-DnaA protein sequence.

29. The fusion protein of Claim 28, wherein said DnaA protein comprises
SEQ ID NO:222.

30. The isolated nucleotide sequence set forth in SEQ ID NO:230.

31. The nucleotide sequence of Claim 30, wherein said nucleotide sequence
further comprises 5' and 3' flanking regions.

32. A recombinant DNA vector comprising at least a portion of the
nucleotide of Claim 30.

33. A host cell containing the recombinant DNA vector of Claim 32.

34. At least a portion of purified DnaN protein encoded by an
oligonucleotide comprising at least a portion of nucleotide sequence substantially
homologous to the coding strand of the nucleotide sequence of Claim 30.

35. The purified protein of Claim 34, wherein said DnaN protein is from
Thermus thermophilus.

36. The purified protein of Claim 34, wherein said DnaN protein comprises
the amino acid sequence set forth in SEQ ID NO:231.

37. A fusion protein comprising at least a portion of the DnaN protein of
Claim 34, and a non-DnaN protein sequence.

38. The fusion protein of Claim 37, wherein said DnaA protein comprises
SEQ ID NO:231.

39. The isolated nucleotide sequence set forth in SEQ ID NO:7.

40. The isolated nucleotide sequence of Claim 39, wherein said nucleotide
sequence further comprises 5' and 3' flanking regions.

41. A recombinant DNA vector comprising at least a portion of the
nucleotide of Claim 39.

42. A host cell containing the recombinant DNA vector of Claim 41.

43. At least a portion of purified dnaX protein encoded by an
oligonucleotide comprising at least a portion of nucleotide sequence substantially
homologous to the coding strand of the nucleotide sequence of Claim 39.

44. The purified protein of Claim 43, wherein said dnaX protein is from
Thermus thermophilus.

45. The purified protein of Claim 43, wherein said dnaX protein comprises
the amino acid sequence set forth in SEQ ID NO:9.

46. A fusion protein comprising at least a portion of the dnaX protein of
Claim 35, and a non-dnaX protein sequence.

47. The fusion protein of Claim 46, wherein said dnaX protein comprises
SEQ ID NO:9.

48. An isolated amino acid sequence as set forth in SEQ ID NO:2.

49. An isolated nucleotide sequence encoding at least a portion of the amino
acid sequence encoding the amino acid sequence of Claim 48.

50. The nucleotide sequence of Claim 49, wherein said nucleotide sequence
further comprises 5' and 3' flanking regions.

51. A recombinant vector comprising at least a portion of the nucleotide
sequence of Claim 49.

52. A host cell containing the recombinant DNA vector of Claim 51.

53. At least a portion of purified tau protein encoded by a polynucleotide
sequence substantially homologous to the coding strand of the nucleotide sequences of
Claim 48.

54. The purified protein of Claim 53, wherein said tau protein is from
Thermus thermophilus.

55. A fusion protein comprising at least a portion of the tau protein of
Claim 53, and a non-tau protein sequence.

56. An isolated amino acid sequence as set forth in SEQ ID NO:1.

57. An isolated nucleotide sequence encoding at least a portion of the amino
acid sequence of Claim 56.

59. At least a portion of purified gamma protein encoded by a
polynucleotide sequence substantially homologous to the coding strand of the
nucleotide sequences of Claim 57.

60. The purified protein of Claim 59, wherein said gamma protein is from
Thermus thermophilus.

61. The nucleotide sequence of Claim 57, wherein said nucleotide sequence
further comprises 5' and 3' flanking regions.

62. A recombinant vector comprising at least a portion of the nucleotide
sequence of Claim 57.

63. A host cell containing the recombinant DNA vector of Claim 62.

64. A fusion protein comprising at least a portion of the gamma protein of
Claim 57, and a non-gamma protein sequence.



65. A method for detecting DNA polymerase III holoenzyme comprising:

a) providing in any order:

i) a sample suspected of containing at least a portion of
DNA polymerase III holoenzyme;

ii) an antibody capable of specifically binding to
at least a portion of said DNA polymerase III holoenzyme;

b) mixing said sample and said antibody under conditions wherein
said antibody can bind to said DNA polymerase III holoenzyme;
and

c) detecting said binding.

66. The method of Claim 65, wherein said sample comprises a thermophilic
organism.

67. The method of Claim 66, wherein said thermophilic organism is a
member of the genus Thermus.

68. An antibody, wherein said antibody is capable of specifically binding to
at least one antigenic determinant on the protein encoded by an amino acid sequence
selected from the group comprising SEQ ID NOS:1, 2, 8, 9, 26, 31, 34, 37, 59, 70,
75, 82, 85, 197, 198, 215, 216, 222, and 231.

69. The antibody of Claim 68, wherein said antibody is selected from the
group consisting of polyclonal antibodies and monoclonal antibodies.

70. A method for identification of polymerase subunits comprising the steps 
of: 

a) providing a test sample suspected of containing amplifiable 
nucleic acid encoding at least a portion of a DNA polymerase III subunit 
protein; 

b) isolating said amplifiable nucleic acid from said test sample; 

c) combining said amplifiable nucleic acid with amplification 
reagents, and at least two primers selected from the group consisting of primers 
having the nucleic acid sequence set forth in SEQ ID NOS:21, 22, 86, 87, 91, 
97, 110, 111, 112, 113, 114, 115, 116, 134, 135, 136, 137, 156, 157, 173, 174, 
176, 177, 208, 209, 210, 211, 212, and 213, to form a reaction mixture; and 

d) combining said reaction mixture with an amplification enzyme 
under conditions wherein said amplifiable nucleic acid is amplified to form 
amplification product. 

71. The method of Claim 70, further comprising the step of detecting said 
amplification product. 

72. The method of Claim 71, in which said detecting is accomplished by 
hybridization of said amplification product with a probe. 

73. The method of Claim 70, wherein said primers are capable of 
hybridizing to nucleic acid encoding at least one polymerase subunit selected from the 
group consisting of DnaA, DnaN, DnaE, DnaX, DnaQ, and DnaB. 

74. The method of Claim 70, wherein said test sample comprises nucleic 
acid obtained from a thermophilic organism. 

75. The method of Claim 74, wherein said thermophilic organism is a 
member of the genus Thermus. 

76. An amplification method comprising the steps of: 

a) providing 

i) a test sample suspected of containing amplifiable nucleic 
acid, 

ii) amplification reagent, 

iii) DNA polymerase, 

iv) an adjunct component comprising at least a subunit of a 
thermostable polymerase selected from the group 
consisting of DnaX, DnaE, DnaQ, DnaN, and DnaA, and 

v) at least two primers; 

b) isolating said amplifiable nucleic acid from said test sample; 

c) combining said amplifiable nucleic acid with said amplification 
reagent, said DNA polymerase, said adjunct component, and said at least two 
primers under conditions such that amplifiable nucleic acid is amplified to form 
amplification product. 

77. The method of Claim 76, further comprising the step of detecting said 
amplification product. 

78. The method of Claim 77, in which said detecting is accomplished by 
hybridization of said amplification product with a probe. 

79. The method of Claim 76, wherein said primers are capable of 
hybridizing to nucleic acid encoding at least one polymerase subunit selected from the 
group consisting of DnaA, DnaN, DnaE, DnaX, DnaQ, and DnaB. 

80. The method of Claim 76, wherein said DNA polymerase is selected 
from the group consisting of Taq polymerase, E. coli DNA polymerase I, Klenow, Pfu 
polymerase, Tth polymerase, Tru polymerase, Tfl polymerase, Thermococcus DNA 
polymerase, and Thermotoga DNA polymerase. 

81. The method of Claim 76, wherein said amplification reagent comprises 
components selected from the group consisting of single-stranded binding proteins, 
helicases, and accessory factors. 
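
The primer-based methods of Claims 70 and 76 rest on a simple geometric requirement: one primer must match the template strand and the other must match the complementary strand further downstream. A minimal sketch of that check is given below, assuming exact matches only. The function names and the toy sequences are hypothetical and chosen for illustration; they are not the primers of the listed SEQ ID NOS, and nothing here models annealing temperature or mismatches.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (A, C, G, T, N)."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(pairs[base] for base in reversed(seq.upper()))


def predict_amplicon(template: str, forward: str, reverse: str):
    """Return the product bounded by an exactly matching primer pair, or None.

    The forward primer must occur on the given (top) strand. The reverse
    primer anneals to the complementary strand, so its reverse complement
    must occur on the top strand downstream of the forward primer.
    """
    template = template.upper()
    start = template.find(forward.upper())
    if start < 0:
        return None
    rc = reverse_complement(reverse)
    site = template.find(rc, start + len(forward))
    if site < 0:
        return None
    return template[start:site + len(rc)]


# Hypothetical toy sequences, for illustration only.
template = "TTGACCATGGCTAGCAAGGGTACCCTGAAGCTTGGA"
forward = "CCATGGCTAG"
reverse = "CAAGCTTCAG"
print(predict_amplicon(template, forward, reverse))
```

Returning `None` when either primer fails to match keeps the "no product expected" case explicit; a real primer check would also have to tolerate mismatches and degenerate positions.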
Organism       Start       Sequence                 SEQ ID NO: 
               Position 

E. coli        3           LARKWRPQTFADVVGQEH       SEQ ID NO:224 
                           L R++RP TF +VVGQEH       SEQ ID NO:5 
Tth gamma      1           SALYRRFRPLTFQEVVGQE      SEQ ID NO:1 
Tth tau        1           SALYRRFRPLTFQEVVGQEH     SEQ ID NO:2 
                           ALYR FRP F++VVGQEH       SEQ ID NO:6 
B. subtilis    5           ALYRVFRPQRFEDVVGQEH      SEQ ID NO:4 
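
The unlabeled rows in the table above are match lines: a letter marks an identical residue in the two neighbouring sequences and "+" marks a conservative substitution. The sketch below shows one way such a line can be derived for two equal-length, already aligned segments. The list of conservative pairs is a small hand-picked assumption, not the substitution matrix used to prepare the figure, and the segments are the first thirteen residues of the Tth tau and B. subtilis rows with the leading S of the Tth row omitted.

```python
# Hypothetical, deliberately small set of "conservative" residue pairs.
# Alignment programs derive these from a substitution matrix; this list is
# an assumption made only so that the example is self-contained.
CONSERVATIVE = {
    frozenset(pair)
    for pair in ("DE", "EQ", "KR", "FW", "FY", "IL", "IV", "LV", "ST")
}


def midline(a: str, b: str) -> str:
    """Build a match line for two aligned, equal-length peptide segments."""
    if len(a) != len(b):
        raise ValueError("segments must be aligned to the same length")
    marks = []
    for x, y in zip(a, b):
        if x == y:
            marks.append(x)          # identical residue
        elif frozenset((x, y)) in CONSERVATIVE:
            marks.append("+")        # conservative substitution
        else:
            marks.append(" ")        # no similarity recorded
    return "".join(marks)


tth_tau = "ALYRRFRPLTFQE"
b_subtilis = "ALYRVFRPQRFED"
print(midline(tth_tau, b_subtilis))
```

The printed line keeps one column per residue, so runs of blanks are preserved rather than collapsed as they are in the scanned table.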
FIGURE 9 

A 

TCCGCCAGGCGCTITCCCGTGAGGTAGAGGGCCAtCTCCACCACCTCGAGClGCTCC 

GGGGGCGTCACCCGAAGCCCCCCCACGAAGCGGCTCTCGTACCCCAGGCGCCCAAGC 

G AAGCCCCG ATCTCGGGCCCGCCCCCGTGGACGAGAACC AAGGG ACC AGGGT AGGC 

GGCAAGCTCGTCCAATAGGGCC7CCGrCCCCCr/i/lG<^7TCCrCCC/lCC7TC/lCC>4/J/}/! 

GGGCCTCA CTCAA GGTAGA CCACCTCCTCGGGCA GGCCCTCCTCC TCCTCGGCCTCCA G 

GAGGTCGGCGAA GCCCA GAA GGGCGTGGTCCGGGTGGA GGCCGAA GCGGGCGGCGA G 

GGCCCGGACCGCGTGCTCGGGCGGGAGGGCCTGGGCCCTCGTCAGGAGGGGGGCGAG 

GTCGGCCGGGGCCTTCAGGGCCGTCCCAAGCTCCGGACCCGCGCGAAGCTCCGCCCTC 

A CC TCCCCCCCCTCGGCCA GGAGGAA GA GGAGGTCCTCCTCGCTTA G TA CCA GGAA GGC 

CAA GGCCTGCCCTAGGCGGGAAA GCTCCCGCA GGAA GGCCCGGA TGGCCTCCGGG TCC 

TCCTCCGCCTCGAGGGCCTCGfGGTAGAGGCTCGTCCAAGGGGGGTCGGGGACCAGGT 

AGACCCCCGTCCCTCCCGTGCGCCCCTCGGCCCAGGCCGCCACCGCCTCCAGGGGGGC 

CTGCA CG TGCAA GGA GA GGAA GCTCCGCACCA CGCCCTA TA CTA GCCC77£IG_AGCGCC 

CTCTACCGCCGCTTCCGCCCCCTCXCCTTCCAGGAGGTGGTGGGGCAGGAGCACGTG " 

AAGGAGCCCGTCCTGAAGGCCATGCGGGAGGGGAGGCTGGCCCAGGCCTACCTCTTC 

TCCGGGGCCAGGGGCGTGGGCAAGACCACCACGGCGAGGCTCCTCGCCATGGCGGT 

GC^jGTGCCAGGGGGAAGACCCGCCTTGCGGGGTCTGGCCCCACTGCCaGGCGGTGCA 

GAGGGGCGCCCACCCGGACGTGGTGGAGATTGACGCCGCGAGCAACAACTCCGTGG 

AGGACGTGCGGGAGCTGAGGGAAAGGATCCACCTCGGCCCCCTTTCTGCCCCCAGGA 

AGGTCTTCATCCTGGACGAGGGCCACATGCTCTCCAAAAGCGCCTTCAACGCCCTCCT 

CAAGACCCTGGAGGAGCCCCCGCCCCACGTCCTCTTCGTCTTCGCCACCACCGAGCC 

cgagaggatgccccccaccatcctctcccgcacccagcacttccgcttccgccgcctc 

acggaggaggagatcgcctttaagctccggcgcatcctggaggccgtggggcggga 

ggcggaggaggaggccctcctcctcctcgccggcctggcggacggggcccttaggga 

cggggaaagcctcctggagcgcttcctcctcctggaaggccccctcacccggaagga 

ggtggagcgcgccctaggcctcccccccagggaggccctggccgagatcgccgcctc 

cctcgcgagggggaaaacggcggaggccctgggcctcggccggcgcctctacgggg 

aagggtacgccccgaggagcctggtctcgggccttttggaggtgttccgggaaggcc 

tctacgccgccttcggcctcgcgggaaccccccttcccgccccgccccaggccctgat 

cggcgccatgaccgccctggacgaggccatggagcgcctcgcccgccgctccgacgc 

cttaaggctggaggtggccctcctggaggcgggaaggggcctggccgccgaggccct 

GCGCCAGCCC ACGGGCGCTCCCCGGGCAGAGGTCGGCCGCAAGCGGG AAAGCCCCCC 

GGCCCCGGAACCCCCAAGGCCCGAGGAGGCGCCCGACCTGCGGGAGCGGTGGCGGG . 

CCTTCCTCGAGGCCCTCAGGCCCACCCTACGGGCCTTCGTGCGGGAGGCCCGCCCGG 

AGGTCCGGGAAGGCCAGCTCTGCCTCGCTTTCCCCGAGGACAAGGCCTTCCACTACC 

GCAAGGCCTCGGAACaGAAGGCGAGGCTCCTCCCCCTGGCCCAGGCCCATTTCGGGG 

TGGAGGAGGTCGTTCTCGTCCTGGAGGGAGAA AAAAAA AGCCTGAGCCCAAGGCCC 

cgctcggccccacctcctgaagcgcccgeacccccgggccctcccgaggaggagg'i a 

gaggcggaggaagcggcggaggaggccccggaggaggccttgaggcgggtggtcc 

gcctcctgggggggcgggtgctctgggtgcggcggcccaggacccgggaggcgccg 

gaggaggaacccctgagccaagacgagatagggggtactggtatax44tgggggca 

tgacgcggaccagcgacctcggacaagagaccgtggagaacatcctcaagcgcctcc 

GCCGTATTGAGGGCCAGG rGCGGGGGCTCCAAAAGATGGTGGCCGAGGGCCGCCCC 

TGCGACGAGGTCCTCACCCAGATGACCGCCACCAAGAAGGCCATGGAGGCGGCGGC 

CACGCTGATCCTCCACGAGTTCCrGAACGTCTGCGCCGCCGAGGTCTCCGAGGGCAA 

GGTGAACCCCAAGAAGCCCGAGGAGATCGCCACCATGCTGAAGAAGTTCATCTAGAT 

GGGTCGGGTTCGGGGGCGCCTCCGGCGCCTCCTCCGGGCCCTTCTCGCCCAGGAGGC 






WO 99/13060 PCT/US98/18946 




FIGURE 9 (CONT.) 



B 

SALYimFRPLTFOEVVGQEH VXEPLLKAlREGIl^ 

VGCQGEDPPCGVCPHCQAVQRGAHTO 

FILDEAHN^SKSAJ^ALLKTLEEPPPHVLFVFATTE 

FKJLRJRJDLEAVGREAEEEALLLLAR^ 

EALAEIAASLARGKTAEALGLARRI.YGEGYAPR 

PPQAJLIAAMTAJLDEAMERJLAJUISDALSLEVAJLLEA 

SPPAPEPPRPEEAPDLRERWRAFLEALRPTLRAFV^ 

KASEQK AR1XPLAQABUFGVEEVVLVLEG 

EAAEEAPEEALRKVVRJLLGGR\O.WVRRPRTREAPEEEPLSQDEI^ 
GAAAAAAAAAGCCTGAGCCCAAGGCCCCGCTCGGCCCCACCTCCTGA 
    E K K S L S P R P R S A P P P 
-1  E K K K P E P K A P L G P T S 
+1  E K K A 
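
The reading frames shown for this stretch can be checked by direct translation. The sketch below translates the 47-nucleotide segment in frame and then with a one-nucleotide slip after the third codon in either direction. It assumes the standard genetic code, and the helper names are hypothetical; it illustrates how the frames differ and makes no claim about which frame is used in the cell.

```python
from itertools import product

# Standard genetic code, codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate complete codons from position 0, stopping after a stop (*)."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        residue = CODONS[dna[i:i + 3]]
        protein.append(residue)
        if residue == "*":
            break
    return "".join(protein)


def translate_with_shift(dna: str, codons_before: int, shift: int) -> str:
    """Translate `codons_before` codons in frame, then slip by `shift` bases."""
    pivot = 3 * codons_before
    return translate(dna[:pivot]) + translate(dna[pivot + shift:])


# Segment from the figure, written codon by codon in the unshifted frame.
segment = (
    "GAA AAA AAA AGC CTG AGC CCA AGG CCC CGC TCG GCC CCA CCT CCT GA"
).replace(" ", "")

print(translate(segment))                     # unshifted frame
print(translate_with_shift(segment, 3, -1))   # slip back one nucleotide
print(translate_with_shift(segment, 3, +1))   # slip forward one nucleotide
```

Both shifted frames run into a stop codon (shown as `*`): the +1 frame almost immediately, and the -1 frame at the TGA that ends the segment.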
FIGURE 10 



A 



AAAAGCTTCCTGGGGGCANAAAGGG(^GCGAGGTGGATCCTTTCCCTCAGCTCCCGCACNTrrTrrArr; 

gS^ 



B 



213 DPPCGVCPHC^VQRGAHPDWEIDAASNXSXX^^ 34 

D PC C C+ + G+ DV+EIDAASN ♦ AP KV +OE H 

70 DEPCNECAACKG ITNGS I SDVI E I DAASNWGVDE I RD I RDK VKFAPSAVTYKVY 1 1 DEVH 129 

33 MXSKSAFNALL 1 T. thermophilus 

M S AFNALL 
130 MLSIGAFNALL 340 B. subtilis 
FIGURE 11A 

Organism           Start       Sequence       SEQ ID NO: 
                   Position 

M. tuberculosis    10          FVHLHNHTE      SEQ ID NO:27 
                               F HLH HT+      SEQ ID NO:29 
T. thermophilus    4           RFAHLHQHTQ     SEQ ID NO:26 
                               RF HL HT       SEQ ID NO:30 
H. influenzae      6           RFIHLRTHTD     SEQ ID NO:28 
FIGURE 11B 

Organism           Start       Sequence                        SEQ ID NO: 
                   Position 

T. thermophilus    91          XFTGYQNLVRLASRAYLEGFYEK         SEQ ID NO:31 
                               TGYQNL L S+AY Q+                SEQ ID NO:33 
E. coli            92          TGYQNLTLLISKAYQRGY              SEQ ID NO:32 

                               g . 
                               w . . as g 
T. thermophilus    676         p ILDETYGI PVYQEQqMQiAaavaxY    SEQ ID NO:225 
                                                               SEQ ID NO:226 
                                                               SEQ ID NO:34 
                               +L+ TYGI +YQEQ MQIA             SEQ ID NO:36 
E. coli            676         VLEPTYGI ILYQEQVMQIA            SEQ ID NO:35 

                               a 
T. thermophilus    853         GLDGGYFHLTlfd                   SEQ ID NO:227 
                                                               SEQ ID NO:37 
                               +GGYF+ lFD                      SEQ ID NO:39 
E. coli            853         RNKGGYFR - ELFD                 SEQ ID NO:38 
FIGURE 12 



(A) 

AGGAGACGACCCCCCGAGGACCCCGCCTTGGCCATGACCGA^ 

taccSgwct^cgcc^gcgagccgggcttacctg b 



(B) 

Synechocy 12 YSLLDGASQLPALI 2 5 

+SLLDGA++L L+ 
Tth 2 FSLLDGAAKLSXLLXWVK 55 

+S+LDGAAK++ +L V+ 
M. tuber 19 YSMLDGAAKITPMLAEVE 36 

Synechocy 34 PAI ALTDHGVMYGAVELLKVCRGKPIKPIIGNEMYV 69 

75 PaIamSSgNFFGAXXFYKKATEMGIKPILGYEAXVAAESRFD^ 212 
du MTDHGN FGA FY AT+ GIKPI+G EA +A SRFD +R 
M. tuber «1 PAvgSSgNMFGASEFYNSATKAGIKPIIGVEAYIAPGSRFDTRR 8 

Synechocy 83 FHQWLAKNNQGYRNLVKLTT 103 

FH AK+ GY+NLV+L + 

Tth 228 GGYFHXTXXAKDFTGYQNLVRLASRA 305 

G Y H T A++ TG +NL +L+S A 
M. tuber 103 GSYTHLTMMAENATGLRNLFKLSSHA 128 
FIGURE 13 



(A) 

ATGGGCCGCAAACTCCGCTTCGCCCACCTCCACCAGCACACCCAGTTCTCCCTCCTGGACGGGGCGGCGAARC 
TTTCCRACCTCCTCAAKTGGGTCAAGGAGACGACCCCGCGAGGACCCCGCCTTGGCCATGACCGACCACGGCA 
ACTTCTTCGGGGCCGKGGAKTTCTACAAGAAGGCCACCGAAATGGGCATCAAGCCCATCYTGGGCTACGAGGC 
CTAMGTGGCGGCGGAAAGCCGCTTTGACCGCAAGCGGGGAAAGGGCYTAGACGGGGGCTACTTTCACYTCACC 
YTCYTCGCCAAGGACTTCACGGGGTACCAGAACCTGGTGCGCCTGGCGAGCCGGGCTTACCTGGAGGGGTTTT 
ACGAAAAGCCCCGGATTGACCGGGAGATCAYCCTGCGCGAGCACGCCGAGGGCCTCATCGCCCTCTCGGGGTG 
CCTCGGGGCGGAGATCCCCCAGTTCATCCTCCAGGACCGTCTGGACCTGGCCGAGGCCCGGCTCAACGAGTAC 
CTCTCCATYTTCAAGGACCGCTTCTTCATTGARATCCAGAACCACGGCCTCCCCGAGCAGAAAAAGGTCAACG 
AGGTCCTCAAGGANTTCGCCCGAAANTACGGCCTGGGGATGGTGGCCACCAACGACGGCCATTACGTGAGGAA 
GGAGGACGCCCGGGCCCACGAGGTCCTCCTCGCCATCCAGTCCAAGAGTACGYTGGAGGACCGGGGGCGGTGG 
CGCTTCCCCTGCGACGAGTTNTACGTGAAGACCCCSGANGAGATGCGGGCCATGTTCCCCCGAGGAGGAGTGG 
GGGGACGAGCCCTTTGACAACACCGTGGAAGATCGCCCGCATGTGCAACGTGGAGCTGCCCATCGGGGGGACA 
AGATGGTCTACCGCATCCCCCGCTTCCCCYTCCCCGCCCGTCGGAMCGARGCCCAGTTACTTCATGGAGCTCA 
CNTTTAAGGGGCTCCTCCGCCGCTACCCGGAGCGGATCACCGAGGGCTTCTACCGGGAGGTCTTCCGCCTTTT 
GGGGAAGCTTCCCGCCCACGGGGACGGGGAGGCCCTGGCCGAGGCCTTGGCCCAGGTTGGAGCGGGAGGCTTG 
GGAGAAGCTCATTG 
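
The nucleotide sequence in panel (A) contains ambiguity codes such as R, K, Y, M, S and N alongside A, C, G and T. The sketch below shows how such a string can be expanded into a search pattern and how many concrete sequences it stands for. It assumes the standard IUPAC nucleotide codes; the helper names are hypothetical, and the short fragment is taken from the end of the first line of panel (A).

```python
import re

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def to_regex(seq: str) -> str:
    """Convert an IUPAC-coded sequence into a regular expression."""
    parts = []
    for code in seq.upper():
        bases = IUPAC[code]
        parts.append(bases if len(bases) == 1 else f"[{bases}]")
    return "".join(parts)


def degeneracy(seq: str) -> int:
    """Count the concrete sequences an IUPAC-coded sequence can represent."""
    count = 1
    for code in seq.upper():
        count *= len(IUPAC[code])
    return count


fragment = "GCGGCGAARC"
print(to_regex(fragment), degeneracy(fragment))
print(bool(re.fullmatch(to_regex(fragment), "GCGGCGAAGC")))
```

A degeneracy count is a quick way to see how uncertain a stretch of sequence is: each N multiplies the count by four, each two-base code by two.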



(B) 

RKLRFAHLHQHTQFSLLDGAAKLSDLLNWVKETTPEDPALAMTDHGNFFGAVDF YKKATEMG I KPILG Y EAY V 
AAESRFDRKRGKGLDGG YFHFTLLAKDFTG YQNLVRLASRAYLiEG F Y EKPRI DREI TLREHAEGLI ALSGCLG 
AEIPQFILQDRLDLAEARIjNEYLSIFKDRFFIEIQNHGIiPEQKKVN EVLKDFARKYGLGMVATNDGHYVRKED 
ARAHEVLLAIQSKSTLDDPGRWRFPCDEFYVKTPDEMRAMFPPRRSGGTSPLTTPWKIARMCNVELPIGGTRW 
STASPASPSPPVGPKPSYFMELTFKGLLRRYPDRITEGFYREVFRLLGKLPPHGDGEALAEALAQVGAGGLGE 
A 

(C) 

[Alignment of the M. tuber, Tth, and Syn. sp. sequences; the residues are not recoverable from the scanned figure.] 
FIGURE 14 



(A) 



CTGGACCTGGCCGANGCCCGGCTCAACAAGTACCT 



(B) 

CTGCACCGGGATGAGCTTGGCCAATTCCTCCGCCTTCTTGTGGGGGATGCCGTAGACCCGGGCCACGTCCTTG 
AGGGCGGCCTTGGAGGCGAGGCTTCCCAGGGTGCCGATCTGGGCCACCTTGTCCTCGCCGTAGCGTTCCCGCA 
CGTACTGGATCACCCGGTCCCGCTCCCGGTCGGAGAAGTCCGTGTCAATGTCGGGCATGGAGACCCTCTCGGG 
GTTCAGGAAGCGCTCAAAGAGGAGGCCGAAGCGCAGGGGGTCAATGTTGGTGATCCCCACGGCGTAGGCCACC 
AGGCTCCCGGCGGCGCTCCCCCTGCCGGGCCCCACGGAGACGCCGTTTCTCCGGGCCCAGTTGATGTATTCCT 
GGACGATGAGGAAGTAGCCGGGNAAACCCCATGCGCTCTATCACGGGAAAGCTCGTAAAGGGCCCGGTGGAAA 

ATGGCC 



(C) 



Tth 



393 GXPGYFLIVQEYINWARRNGVSVGPGRGSAAGSLVAYAVGITNIDPLRFGLLFERFLNPE 2 
? prm V E+I " ++ NGV VGPGRGS AGSLVAYA+ IT++DPL F LLFERFLNPE 

E.coli 336 gfpgySi^Ifiq^kdngvpvgpgrgsgagslvayalkitdldplefdllferflnpe .3 

Tth 213 R VSM£^ T OFSORERORVi ;r > 

E.coli 396 4 

Tth 33 AEELAKLIP 7 -- Tth base/ 

+ ++KLIP 

E.coli 456 VDRISKLIP 464 E. coli amino acid number 
FIGURE 15 



TTG/^GGTGTGGAAAAGCTC CTCCTGGGTC 

ACCAGCAGGAGGTCCACGGAGCGGTACCGCTCCCGG AACTCCGTCATCCGGTCCTCGCGGAT . 
GGCGTrGATGAGCTCGTTGGTGAAAGTTTC 

AnrnnrTnGrCACnnAGTGGCCCACGGCGTGCATCAG GTGGGTCTTCCCCAATCC 
FIGURE 16 



A: 



TCGAAGGCCGCCnTGTGC GCCACCAGCACCGTGTCCTGGACGAAGGCGCGGAAGGCGGGGAG 

GACCGCCTCTAGGGGAGGCTTGTCCCGGANCATCTCCGCCGTGAGGCCGTGGACCGCCGTGG 

CCGCGGGGGANATGGGGCGCCCCGGGTTCACCAGGGCCTCGAACACCTC 

cttcgacccaggatatggaccccggcgagggccaccacggcatcttgctccgggtccagcjcc 
cgtggt'ctcagtgtg 



B 



E.- coli: /1« FHVYLKPDRLVDPEAFCVHG I ADEFLL.DKPTFAEVADEFMDY I RGAELV 96 

K •* P R PA VHG+ E DKP V F +> « LV 

Tth 150 FEALVNPGRPXSPAATAVHGLTAEMXRDK'PPLEAVLPAFRAFVODTVLVA 1 

FEA NP RP S + G+T +M *D P « V+ FR . D + LVA 

B. subtilis 455 FEAFANPHRPLSATT I ELTG ITDDMJ*QDAPDWDVI RDFREW IGDDI LVA 504 
FIGURE 17A 



OOATCCCG^GCCTCTGGAGGTGCCCTAGAAOGGTCTCGGCCTCGGCCAaQGTCATCTCCCCOGCOAGGGC 

OTAGCGGTAAAGGGCGTCGGCCGCTTCATAAAGGGCCAGGGTGGGGGCGGCGAGGGGGCGGTCTTCCTG 

CCTCCAGCCCCGGTAGAGGCGTACGGCTTCGGCCCTTGGGTCTGCCGCCAAGAGCAGGCGGAGGAGGAA 

GCTAGCGTCC^GCACCACCCAGCTCACGGTCCCGCTCCTCCCTCAGGGCGGAAACCACCTCCTCAGGGGG 

OGTAAGGGOCCTGCCCAAGCGGGCGTGGATGGCCCGOTTAAGGGAGAGGAGGGCCTCCAAGGGGTCCTC 

CTTCCCCGCGGTGAGGCGGGCGTAGGCCTTCACGGAOAGGATGACCGCCTGCGGCCTGCCCCGCCCCTCC 

ArrACGATCACCTCCCCTTCCTCCACCCGCCTCAGCACCtCGCCGAAGTGGACGCGGGCCTCGGTGGCGCT 

TAAG ACGCGCTTGCGATGCATCATTTCATGCATTAGTTTAGCCGCCCCGGTCCCTCCCCGCAAGGTAAACT 

GGGGACC A Tnn r.rrr.CAAACTCCGCTTCGCCCACCTCr ACCAGCACACCCAGTTCTCCCTCCTGGA 

trrirtcz^rr^n cCi A aGCTTTCCG ACCTCCTC AAGTGGGTC A AGGA G ACG ACCCCCG AGGACCCCGCCT 

T^r.rrATr. »rrGACCACG GCAACCTCTTrGGGGCCG T GGAGTTCTACAAGAAGGCCACCGAAATGG 

r/-,T/-.,,r.rrr ATrrT flr,GCTACGAGGCCTACGTGGCGG CGGAAAGCCGCTTTGACCGCAAGCGG 

fy.t a aG GGGGT aGACGGG GGCTACTTTCACCTCACCC T GCTCGCCAAGGACTTCACGGGGTACCAG 

TG A AGGac'cGCTTGTTGATTGAGATCC AGAACCA CGGCCTCCCCG AGCAGA ^^^^^^^^'^t 1 ^^^.^ 
TrrT P a aGG AGTTCGCCCG AA AGTACGGGCTGGGGAT GGTGG CC ACCAAC GACGGCCATTACGTGA 

rrrGGGrGr TGGCGCTTCCCCTGCGACGAGTTCTA CGTGAAGACCCCCGAGGAGATGCGGGCCAT 
^TT ^GGOGAGG AGGAGTGGGGGGACGAGGCC TTTGACAACA C GGTGGAGATCGCCCGCATGTG^^ 
^GGTGGAGCTGGGG ATGGGGGACA aGATGGTCTACCGCAT CCCCC GCTTGCCCCT CCCCGC CCGT^ 
GGACCGAGGr rrAGTACCTCATGGAGCTCA C CTTTAAGGGGCTCCTCCGCCGC^ACCCGGACC ^ 

,r. r.rrrTr,GrCGAGGCCT Tr,r,CCCAGG TGGAGCGGGAGGCTTGGGAGAGGCTCATGAAGAGCCTQ 
rGGCCCTTGGGGGGGGTGAAGG^^^ 

a GaGGGTGTGG ATGGCCGACATTGAGAGGGACT TrTGCGAGCGGGAGCGGGACCGGGTGATCCAG 
TACGTGCGGG A> GGGT AGGGCGAGG aG a aGGTG<?CCCAG ATCGGCACCCTGGG AAGCCTCGCC'r ^ 
<~ a A/in.mrzrr nr a aGGa GGTGGCCCGGGTCTACGGC A TCCCCCACAAGAAGGCGGAGGAATTGG 
^AA^f-Tf-ATf- rGGGTG rAGTTCGGGAAGCCCAA G CCCCTGCAGGAGGCCATCGAGGTGGTGCCG 
G AGCTTAGGGGGG AG aTGGAGA AGG AGGGCAAGG TGGGGG AGGTCGTrGAGGTGGCCATGCGCCT 
GGAGGGCCTGa a.GGGCCACGCGTGGGTGCACG^G GGGGGGGTGGTGATCGCCGCCGAGCCCCTCA 
rGGACrT CGTCCGCCTCATGCGCGArrAGGAAGGG CGGCCCGTCACCCAGTAGGACATGGGGGCG 
> ptt!~ r rrrTTTTf: i a GATGGaGTTTTTGG GCCTCCGCACCCTCACCTT G CTGGACGAG 
GTG^gc^gcIx^X^GG^^^^ 

GGATGlrGGG rACGCTC 

TC^AC^GCCOrGGGGGG ATGGAGG A C ATGGCCACC TAC ATCCGCCGCC AGG A GGGGCTGGAG/CCCG 
T ^A^f- T ArAGr r-AGTTTGGCCACGCrGAGAAGTAC CTAAAGCCCATCCTGGACGAGACCTACGGCA 

T^G GGTGT^GGAGCA 

G^r^^C^^C^T^^^^G^GGCCA* I 'GGGG A aGAAGAA GCTGAGGAGATGG a ga agg agcgggagcgc 

TTC^4cAGGGGGG^ 

A^^inr^T^^^^^ a a^ta GGGGTTCAACAA ATCTCATGCAGCGGCCTACA G GGTGGTGTCCTACC 

, t rn,rTrrr.,rA aGGTGG GCGAGTACATGGGCGACGC G GGGGCCATGGGCATAGAGGTCCTTCCC 

CCGSTCMrr^ 
ggggtgaaaa aGGTGGGCGAG 

o^PJ^^^^TJ^r^ArTT rrTrAAGrGGCTGCACG AGAAGGTGC TCAACAAGCGGACCCJ^ 

gagtgc??^ 
FIGURE 17A (cont.) 



GGAGGGGCTCCTCAGGTGGGC GGCCGAGACTCGGGAGAAGGCCCGCTCGGGCATGATGGGCCTCT 

TCAGCGAAGTGGAG GAGCCGCCTTTGGCCGAGGCCGCCCCCCTGGACGAGATCACCCGGCTCCGC 

TACGAAAAGGAGGCCCTGGGG ATTTATGTTTCCGGCCATCCCATCTTGCGGTATCCCGGGCTrCGf: 

GAGACGGCCACCTGCA CCCTGGAGGAGCTTCCCCACCTGGCCCGGGACCTGCCGCCCCGGTCTAG 

GGTCCTCCTCGCCGGCATGGTGGA GGAGGTGGTGCGCAAGCCCACGAAAAGCGGCGGGATGATGG 

CCCGCTTCGTCCTCTCCGACGA GACGGGGGCGCTTGAGGCGGTGGCCTTCGGCCGGGCCTACGAC 

CAGGTCTCCCCGAGGCTCAAGGAG GACACCCCCGTGCTCGTCCTCGCCGAGGTGGAGCGGGAGGA 

GGGGGGCGTGCGGGTGCTGGCCCA GGCCGTTTGGACCTACGAGGAGCTGGAGCAGGTCCCCCGGG 

CTGGACGAGCACGCG GGGACCCTCCCCCTGTACGTCCGGGTCCAGGGCGCCTTCGGCGAGGCCCT 
CCTCGCCCTGAGGGAGGTGCGGGT GGGGGAGGAGGCCTTGGCGGCCCTCGAGGCCGAGGGGTTCC 
GGGCCTACCTCCtGCCTGACGGGGAGGTCCTCCTCCAGGGCGGCCAGGCGGGGGAGOCCCAGGAG 
GCGCJ I GCCf IILI AUUOUO 1 UUUCCU 1 U AO AC AliOU 1 OCC.A 1 CU 1 (JC J CUCCUUUUUC AAUU AUUCC 1 

GGGCCGAGCGCTTTGGGGTGGGGAGCAAGGCCCTCGTGCCCTACCGCGGCCGGCCCATGGTGGAGTGGGT 

CCTGGAGGCCCTCTACGCCGCGGGGCTTTCCCCGGTGTACGTGGGGGAGA.^CCCCGGCCTCGTTCCCGCG 

CCCGCTCTCACCCTTCCCGACCGCGGAGGCCTCCTGGAAAACCTGGAGCAGGCCCTGGAGCACGTGGAGG 

GGCGCGTGCTGGTGGCGACCGGGGACATCCCCCACGTCACGGAGGAGGCGGTCCGCTTCGTTCTGGATA.A 

GGCTCCTGAGGCGGCCCTGGTCTACCCCATTGTGCCCAAGG AGGCGGTGGAGGCCCGCTTTCCCCAG ACC 

AAGCGCACCTACGCCCGCCTCCGGGAGGGGAGCTTCACCGGCGGCAACCTTCTCCTTTTGGACA.AGTCCC 

TCTTCCGGGAGGCCCTTCCCCTGGGCCGGCGGGTGGTGGCCCTGCGCAAGAGGCCTTTGGCCCTGGCCCG 

CCTCGTGGGGTGGGACGTGTTGCTGAAGCTCCTCCTGGGCCGCCTGTCCCTCGCCGAGGTGGAGGCGCGG 
GCCCAAAGGATCC 
FIGURE 17B 



MGRKXRFAHI*HQHTQ FS LL DGAAKLS DLL KWV K^TTPE1DPAIJCMTDHGNI«FGAVEE*YKKATEMG I K'P I L G Y EA Y V AA E S R 
POP, KP.O KOIiDOOTr KIjTIjIjJlKT3rT<3X Qm/\nftI*A3 KATCTjEOrXEKP P* I D Pv E I L Px E H A EGL I AL 5 GC L'^.EIF v flLC'DP.LC' 
L A E A RL H E Y LS I F KIT F. F F I E I ON H GL P EQ KKV N EV L KE FA P. K Y G LGMV AT N DGH Y V R KE OA RA H E V L L A I O S V. > T L O C > P G 
RW R F PC D E F Y V YCT P Z EM RAM F P E E E WG D E P F DH T V E I A RMC H V E L F I G D KMV Y R I P R F P L P A R RT E AO Y L M E L T F KG L L, P. 
RYPDRITEGFYREV FP.LLGI-CLPPHGDGEALAEALA.QVEREAWERLMKSLPPLAGVKEVJTAEAI FHFALYELS7 I ERHGFP 
G Y FL I VQ D Y I N W A R R N G V S V G P G RGS AAGS L V A Y A V G I T 1 4 1 DP L R FGL L F E R FL N P E R V S M P D I DT D F S D R E r. C' R V 1 0 Y V 
RERYGEDKVAQIGTLGSLASKAALKDVARVYGI PHKKAEELAKLI PVQFGKPKPLQEAIQWPELPAXMEKDr KVREVLE 
VA>1RLEGLNRHASVH^GVVIAAEPLTDLVPLMRDOEGRPVTOYDMGAVEALGLLKMDFLGLRTLTFLDEVK?JVKASQG 
VELD Y DAL P L DD P KT FALLS RG E T KG V FQ L E S GGMT AT L RGL KP R R FE DL I A I L S L Y R P G PM E H I F T Y I R R K H G L E R V S Y 
S E FPHAEKYLKPII»DETYGIFVyQEQIMQ3^ASAVAGYSLGEADLLRRAMGKKKLRRCRSTGSAS S RGP R KG AC ? RRRPT A 
SLTWLEAFAWYGFN KSHAAAYSLLSYOTAYVKAHYPVEF^W^LLSVERHDSDKVAEYI RDARAMGI EVLPPDVIIRSGFDF 

LV^'jp^ILr'jLSMVr - C Y » j EWiAth ILP.EKC IVjGP T F\i L O D r L KFvL D £ KV L, N I\" KT L C 3 L I K 3 AL L" 3 r'jt^.MLASLll'jL 

LRWAAETPXKARSGN1?*1GLFSEVEEPPLAEAAPLDEITRLRYEKEALGI YVSGHPILRYPGLRETATCTLEEL? HLA»RDLP 
PRS RVLLAGMV E EVV RKPTKSGGMMAR FVLS DETGALE AVAFGRAY DQVS PRLKEDT PVLVL AE V ERE EGGV ?.7L AOA7 W 
T Y E EL EQV P PAL EV E V E AS LL DDRGV AHL KS LL DEH AGTL PL Y V RVQGA FG E ALLAL REV P.V G E E AL AAL E AI 3FRAYLL 
PDREVLLC"3GC»AGEACEAVPF 
FIGURE 17C 

[Multiple sequence alignment of the Tth, Borrelia, E. coli, and B. subtilis sequences; the residues are not recoverable from the scanned figure.] 
FIGURE 17D 



MAGCLNDIFEAQKIEWHHHHHHLVPRGSGGGGLQNLGRKLRFAHLHQHTQFSLLDGAAK 

LSDLLKWVKETTPEDPALAMTDHGNLFGAVEFYKKATEMGIKPILGYEAYVAAESRFDRK 

RGKGLDGGYFHLTLLAKDFTGYQNLVRLASRAYLEGFYEKPRIDREILREHAEGLIALSGC 

LGAEIPQFILQDRLDLAEARLNEYLSIFKDRFFIEIQNHGLPEQKKVNEVLKEFARKYGLGM 

VATNDGHYVRK.EPARAHEVLLAIQSKSTLDDPGRWRFPCDEFYVKTPEEMR-AMFPEEEW 

GDEPFDNTVEIARMCNVELPIGDKMVYRIPRFPLPARRTEAQYLMELTFKGLLRRYPDRITE 

GFYREVFRLLGKLPPHGDGEALAEALAQVEREAWERLMKSLPPLAGVKEWTAEAIFHRAL 

YELSVIERMGFPGYFLIVQDYINWARRNGVSVGPGRGSAAGSLVAYAVGITNIDPLRFGLLF 

ERFLNPERVSMPDIDTDFSDRERDRVIQYVRERYGEDKVAQIGTLGSLASKAALK.DVARVY 

GIPHKKAEELAKLIPVQFGKPKPLQEAIQVVPELRAEMEKPPKVREVLEVAN4RLEGLNRHA 

SVHAAGVVIAAEPLTDLVPLMRDQEGRPXrTQYDMGAVEALGLLKMDFLGLRTLTFLDEV 

KRIVKASQGVELDYDALPLDDPKTFALLSRGETKGVFQLESGGMTATLRGLKPRRFEDLIAI 

LSLYRPGPMEHIPTYIRRHHGLEPVSYSEFPHAEKYLKPILDETYGIPVYQEQIMQIASAVAG 

YSLGEADLLRRAMGKKKLRRCRSTGSASSRGPRKGACPRRRPTASLTU'LEAFANYGFNKS 

HAAAYSLLSYQTAYVKAHYPVEFMAALLSVERHDSDKVAEYIRDARAMGIEVLPPDVNRS 

GFDFLVQGRQILFGLSAVKNVGEAAAEAILRERERGGPYRSLGDFLKRLDEKVLNKRTLES 

LIKAGALDGFGERARLLASLEGLLRWAAETREKARSGMMGLFSEVEEPPLAEAAPLDEITR . 

LRYEKEALGnvSGHPILRVPGLRETATCTLEELPHLARDLPPRSRVLLAGMVEEWRKPTK. 

SGGMMARFVLSDETGALEAVAFGRAYDQVSPRLKEDTPVLVLAEVEREEGGVRVLAQAV 
WTYEELEQVPRALEVEVEASLLDDRGVAHLKSLLDEHAGTLPLYVRVQGAFGEALLALRE 
VRVGEEALAALEAEGFRAYLLPDREVLLQGGQAGEAQEAVPF 
FIGURE 18A 



AGGGAAGCCG ^CTCG77CTCCTCCTCI!CCGCCCGCAAAGCGCCCCCCGAGGC^ 
CCCAGGGCO"GGTCTACGACCTCACCCTCCTCCA^ 

GCTCCTTTACACCGCCTTTGACCTGGAGACCACGGGCCTGGACCCGGAGCAJ^GATGCCGT-Z-G.TGGCC' 

CCC- I-GGTCCA7ATCC7 :-GGTCGAAGGGTCTTGCGGCAGGAGGTGTTCGAGGCCC^ 

CCC A 7C TCC C C C GC G GC C AC GGC G G TCC AC GGCC TC AC G GC GG A G A T GC TCC GGG AC AAGC C 7CCC C 

GGC :-iTCCTC7CCGCC7TCCGCGCCTTCGTCCAGGACACGGTGCTGGTGGCCCAC.A"j7G 

TGGCCTTTC7 I-CGCCC- :-GCGGGCCTGGACCAGCCCCCCCTCCTGGACACCCTCCTCCTCC-7CCAGC7 : 

TTC;7CCGACC7CA^A':'GACTACCGCCTCGAGGCCCTGGCCCACCGCTTCG t SCGTCCCCGCCA'7CGGGC- 

C>.~C 3CC7T 3 3 3C-3>.T 3COCT-3 -.T-jACOO-'-TG-j . t .'3'jTCTTC'3T'2. t .O'3. L .T-3C. , '.OCCCCTCCT CTTT-jV: 
'GGC ■. - .-.GGCG -Ci'C'l G iG.".CGl'GGl'GGAGGCC"i*">JCiiCGACl'CL'C'7 , J , 'rGGC'CG 
FIGURE 18B 



GSRLVLLLXARK_A.PPEALRK_APAPRALVYDLTLLQ\'PAGLEDVTLDQLI.VTAFD 
LETTGLDPEQDA\"VALAGVHILCiRRVLRQEVFEALV>JPGRP1SP,\ : ATAVFIGLTAF: 
MLRPK^Pl.EAVLPAFRATVX5DTyLVAHhfGAFDLAFLRRAGLDQPPLt-DTLLL.AQ 
LLFPDLXm-RLEALAHRFGVPATGRHTALGaALMTAEVFyRMQPLLFERGL.RRL 
WD\ VEACRDSPWP 
FIGURE 18C 



AIPAGLEDVTLDQLLYTAFDLETTGLDPEQDAVVALAGVHILGRRVLRQEVFEAL 
V^PGRPISPA-ATAV'HGLTAEMLRDKPPLEAVLPAFRAFVQDTV'LVAHNGAFDLA 
FLRR.AGLDQPPLLDTLLLAQLLFPDLKDYRLEALAHRFGVPATGRHTALGDALM 
TAEVFVRMQPLLFERGLRRLWDVVEACRDSPWP 
FIGURE 18D 

FIGURE 19A 



CTGATGCACGCCGTGGGCCACTCCGTGGCCAAGCGCTTCCCCCACCTGAAGA 

TTGAGTACGTTTCCACGGAAACTTTCACCAACGAGCTCATCAACGCCATCCGC 

G AGGACCGGATGACGGAGTTCCGGGAGCGGTACCGCTCCGTGGACCTCCTGC 

TGGTGGACGACGTCCAGTTCATCGCCGGAAAGGAGCGCACCCAGGAGGAGTT 

TTTCCACACCTTCAACGCCCTTTACGAGGCCCATAAGCAGATCATCCTCTCCT 

CCGACCGGCCGCCCAAGGACATCCTCACCCTGGAGGCGCGCCTGCGGAGCCG 

CTTTGAGTGGGGCTTGATCACCGACATCCAGCCCCCCGACCTGGAAACCCGG 

ATCGCCATCCTGAAGATGAACGCCGAGCAGGGGGGGCTGAGGATCCCCGAG 

GACGCCCTGGAGTACAT.CGCCCGGCAGGTCACCTCCAACATCCGGGAGCTGG 

A^GGGGCCCTCATGCGGGCATCGCCTTCGCCTCGCTCAACGGCGTTGACGTG 

ACGCGCGCCGTGGCCGCCAASSSSTCTCCGACATCTTCGCCCCCAGGGAGCTG 

GAGGCGACCCCTTGGAGATCATCCGCAAAGTGGCGACCACTTCGGCCTGAAA- 

CCGGAGAGCTCACGGGGAGCGGCCGAAGAAGGAGGTGGTCCTCCCCCGGCA. 

GCTCGCCATGTACCTGGTGCGGGAGCTCACCCGGGCCTCCCTGCCCGAGATC 

GGCCAGCTCTTCGGCGGCCGGGACCACACCACGGTCCTCTACGCCATCCAGA 

AGGTCCAGGAGCTCGCGGAAAGCGACCGGGAGGTGCAGGGCCTCCTCCGCA 

CCCTC.CGGG A AGGCGTGCACATGAACCCC TGTGGATAA CC TGTGGATAA CC 

CTGTGGATAACCGGGRCTCAAAGCTGTGGATAACCCTGTGGATAACCTGTG 

G A A A AGTCCGGGGGGGCCGAACCCC TGTGGATAA CCCGCACT TTATCCACA 

GG TTATCCACA GGCCGGGGCCAG TTATCCACA GCAGTTAATCCACATGCCC 

CTTTCTTGTCCTGGAGGGAGTTTTCCACCGCCTTAJTACACAATTCTCTGCACA 

GGACCTACTACTACGACTACGTAtCTTTTAAAAGTTTTAAAAAGAGAGTCGTA 

ATAGCCATTAAAAAAGGCGTCGGGCAATGAACAATAACGGTTCCCAAGAACT 

CCTCTAGGACCAGCTTTCCCTACCTGGTGAGATGT 






FIGURE 19B 



LMHAVGHSVAKRFPHLKIEYVSTETFTNELINAIREDRMTEFRERYRSVD 
LLLVDDVQFIAGKERTQEEFFHTFNALYEAHKQIILSSDRPPKDILTLEA 
RLRSRFE\\ r GLITDIQPPDLETRIAILKMNAEQRGLRIPEDALEYIARQYT 
SNIRELEGALMR.ASPSPRSTALT+PAPWPPXXLRHLRPQGAGGDPLEIIR . 
KVATTSA*NRR.AHGERPKKEVVL^ 

pHTTVL^'AlQK.VQELAESDREVgGLLRTLKEGVHMNPCG 
AA \A AG AGAGTCGT AATAGCCATTAAAAAAGGCGTCGGGCAATGAACAATA 
CCCTCT AG AAGCGCCAACCCCC'i C 1ACACC 1 ACC 1 GGGGC 1 1"1 ACGCCGAGG 
AAGGGGCCTTGATCCTCTTCGGGACCAACGGGGAGGTGGACCTCGAGGKCCG 
oCTTCGCGaCG AXjGcCC AA AGCCTTlCCCGGGTGCTcGTcCCCGCCC AGCCCTTC 
TTcC AaCTGGTGCGGAGCCTTYCTGGGG ACCTYGTGGcCCTTGGGCTTGCCTTG 
GAGCCGGGCCARGGGGGGCARCTGGAGCTCTTCTCCGGGCGCTTTCGCA1CCG 
■GCtcAACCTGGCCCCTGgCGAnGGCTACCCCGAGCTTTTGGTGCCCcAgGGGGA 

GGACAAGGGGGCCTTTCCCCTKCGGACGC 
FIGURE 6 

FIGURE 7 
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A FIGURE 9 

Tf X'JGCC AGGCGCTT TCCCCTG AGGTAGAGGGCCATCTCCACC A CCT< -G AGCTCCTCC 
GGG<KrcaTCACOc:CiAA<X*:CttX(;AWA*^ 

r ^AGCCCCG ATCTCGGGCCCGCCCCCGTGGACG AGAACCAAGf.HiAt'X^GGGT AUGC 

CKx;AAGcrcGTCCAArAGGY»:c7e^^ 

CGCCCfCACTCAA GG /AGACCACCitXTiGGCGCA GGCCCTCCTCCTCCTC.GGCClCCAG 
GAGGTCGGCGAA GCCCAGAAGGGCGTX^TCCGGGTUGAGGCCGAAGCGGGCGGCGA G 
GGCCXttACCGCGJUCTOXiGCGC& GGGGGGCGAG 
GTCGGCCGGGGC.CTTCaGGGCOJTXX^CAAGCICCGGACCCGCGCGAA GCTCCGCCC'i c 
A CCTCCCCCCCCTCGGCCAGGAGGAAGAGGAGG TCCTCCTCGCTTAGTA CCAGGAAGGC 

caaggcc7xjccctaggcgggaaagc7GCCGCag&aagkxx!CGGatggcctccggg rcc 

IGC'fCCGCCfCGA GGGCCTCG TGGTAGAGGCTCGTCCAAGGGGGGTCGGGGACCAGG T 
AGACX'CCCGTCCCTCCCGTGCGGGCCTCGGCCCAGGCCGCCACCGT.CTGCAGGGGGGC 
CTXjCAZSGTGCAAGGA GA GGAAGCTCCGCACCACGCCCTATA CTAGCCCTT&IiiAtjCGtLt.'. 

orcrACCGcax:TTa;cxxctXTCA^ 

AA<K?AGCCCCTCCTC/\AGGCCATCCGGGAGGGGAGGCTCGCCC/ t GGCCTA(r:c:n:rr<T 
TCCGGCrTCCACrtXKiCCiTUti^ 

GG<3GTGCCAGGGGGAAGACCCGCCTTG<XiGGGTCTGHriOCCCACTGCCAGGCG<j'TCK.'A 

OAC<KKrreoceeAecc:GGACGTcxn 

AGGACGTGCGGGAGCTGAGGGAAAGGATCX2ACCTCCrOCCOCC'n'TCTGCCCCCACKJA 
AGO' IC TTC A TCCTGGACG A GGCCCAC ATGCTCTCC AAAAGCGCC7TC AACGCCCT CCT 
CAAGACCCTGGAGG AGCXlCXlCGCCCC ACOTCClCnCGTCTmCiOC ACC ACCG A GCC 

CXi AO A CtCj atoccccccaccatcctctcccgcajcccagcacttccgcttccgccgC ctc 

ACCGAGCAGGAGATCGCXinTAAfiCfOCGGCG^Trc^ 

CX/CGGACKJAGGAGGCCCTCCTCCTCCTCGCCCGCCTGGCGGACGKjGGCCCTTAr/GGA 
.CGCGGAjVAGCCTCCTGGAGCGCTTCCTOCTCCTriGAAGGCCCCCTCACCCGCAAC3<3A 

rrGrGGAGC(:KX*;CCTACK.KCTCC^^ 
CCTCGCCAGGGGGAJUW\CGGCGGAG<30CC1VKXH^ 

AA^rOGTA CGCCCCG AGGA GCCTGGTCTCGtKKICTTTTGGACGTGTTCCGOG A AGGCC 
TCTACGCCG<XTTCC^CCrCGCGGGAACCCCfXni-CCCGCX;<XGOCCf:AGGCGCTGAT 
CGCCXiCC A TC ACCCCCCTOGACGAGGCCATGG AGCGCCTCGCCCGCCGCTCCGA CG<: 
CTTAAttOr: I GG A (Xi-lXXiCtX: TCCTGG AGGCGCKjA ACKtGCCCTUOOCGCCO a ggccct 
GCCXXXGCCCACGGO/CGCTCCCCCCCCAGAGGTr^^^^ 

GGCCCX.XKiAACCCCCAAGfX:CCOAWAG<JCCK:CCGACCTGCGG</AGCGOTOG<;(iC > rG 
CCTTCCTCG AGGf ICC 1 C AGGCCC ACCCTaCGGGCCTTCG TGCXX rG ACK jfiCCGCCCCG 

GCAAGGCCr CGG A A CIA Cr A A (.i<>CG ArrTrCl-CCT OCf ;CC']'<oG<.XX A< JGCt7C ATTTI IfSGGC 

tT/Cagga gg rc<;n c:r CG i C <r. i gg aggg ag aa a^aaaa aoCctu a ck.cc a aggccc 

C'.tTCTCGGCCC C ACCtCC VGA AGCGCCCCi^ACCCCCCrGGlXt I'CCCO A (.Kr AGG AGCT a 
G AGCJCG'G A G G A AGCGGCGGAGG AGGCCCCGKi AGG AGGCCTTG AGGCGGGT C< 3T r:c 
OCCT CC TGGG fH iG ( i< : OGG * G CMC. TGGGIGCGGCGGCCt : AGO A< :t : CGGOA G GCGCC <j 
G AGGAGG A ACCCf. 'I 'G AGCC A AG ACG AG AT AGGGGGT ACTGGT AT AX^1 % GGGGGC : A 
TG ACGCGG ACC ACCG ACO CGO AC A AG AGACCtiTGO AC A A (.: ATCCTC AAGCG C CTCC 
CCCG T AT TGA ( XK* : ( : A GG VGOGCi GGGCiCt: AAAAG ATGGTGCCCGA ( jG G CCGCCCC . 
TGCG ACG AGG rCCTC A CCX A G ATG ACCGCC ACC AA (j A A f .K jCC ATOC a GGCGG CGC C 
C ACC X.TG A TC CTCC A ( :G AG TTCC I G A AtrGTCTGOG CCGCCG AGGTClTtTC :Ci AGG GCA A 
GGTGAAO(XC A A GA AGCCCG A GG AG AIT! GCC ACC ATGCI G AAG A AGT7 CATCTA G AT 
CGGTCGGCT T CGCGGGCGCCT C<:< JGCGCCTCt ITCCCJGGCCC TTCTCGCCC A GG A GGC 
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FIGURE 9 (CONT.). 



S A UV It R FEtPLTFQE V VGQkH VK ^.PLLiUdPJ}Gl{l^A0Ayi.ySQPRCrVGKT[TAR IX AMA 
v rcQGED^GVC*H<XMVQRGAfIP^^ 

FCLOE Al *n*L$KS A FNA[ J .K TLEEPPPI fVLS- VP ATTEPF-RMPPmSRTQBlFRrRJkLVEHEIA 
HOJ<i^*AVGJlFAEEEAXLXL\f^ 

HA L A fctAAS LARGKTAILMjGt AR R iLYGEGYAPRS L VSOtXE VFREGLY AAFGLAGV P[ .p A 
PPQ ALt AAMTAJJ? FA (v*FRLARJl5 DALS] .F V ALLEAGJL%LAAJL*LPQPT(.iAPPPEVGPKPE 
SPPAPHPPI^EEAPDLRErlW^LeA]^^ 

KASEQKA KLI J> I ,A OA H FGVEE WLVUXiR K ft: S L5 PRPRS APPPEAP APPC.rPPEF-FVE AE 
£AA££H A PFEALRA VVRLLGGk V [ .WVRRPRTHE AJ^ lie L-'Pi .SQDF.I UfJTOI ' 
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FIGURE '9 (CONT.) 



GAAAAAAA^GCCTGAGCCC AAG GCCCC G CT C GGCCC C ACt TCCTG A 
E K K 5 " ]. <S P R P R S A P i> P 
-1 E K K K P £: . I 1 K A P L C P f S 
+ 1 E K K A 



BNSDOCID: <WO 9913060A1TI_> 



13/35 



PCTAJ59S. r lS946 



A 



G£IIECIC±JA^G£C™^ 

AAAACCTTCCTGQGGGCANAAAGGGGGGCGAGG7GGATCCTJTCCCTCAGCT 

G ANTTtjrFGGT0GC(JC3tXTTCMTCTCCAC( :ACCTCCGGfmCiGGCGCCCCTCr< !C ACCG CCTGGCA<3TCt<jG 
GGCANACCCCG<^GG<XK3GTCrrC£tX;CT^^ 

TCTTGCCC ACNCCCXTTCGGCCCGGA AA AAAAGTNGCCTGGGCGAACCTCCCCTCCCGGATGGCCTTGAN 
C> ANGG G CTCCTrCACNTGCTCCTGAC CCACAACC lt riTGAA A . 



I* 



?D3 0?PraVCKJ^V0>S^FDWK2DW^!e&x£xx^^ 3« 

p TC C C4< <St DVt-EIDAHSN ♦ W» KV +01: K 

70 DBTCSfEC^ACWC 1 TNGS J SDVI Bl D>ASNMGV&£lKDI KDKVXPAPKftVTYXVY I IQTtfH 11 J 

1JD >{LtilCiAFttKLL a. mibeil/f 
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FIGURE 11A 





Start 
Pruitlon 


Sequence 


SE<J ID NO: 


JW. tuberculosis 


11) 


FVHL.EINHTE 


SEQ ID NO: 27 




P HLK HT+ 


SjBQIDNO:29 


T- thermophilic \ * 


RPAHLHQKTQ 


S>RQ ID NO:26 




RF EL - [IT 


SEQ ID NO:3fl. 




6 


RFIHJJRlTHTD 


SEQ ID N0:i8 
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FIGURE I1B 




Organism 


Pfinitioci 


Sequence- 


SKQ IT) NO: 


T. thermophilic 


91 


X FTGYOH LVRL A3RAYLEG P YEK 


SBQ ID NO:3( 




TOYQNI/ I. S+AY « + 


SEQ !D NO:33 


coii 


92 


TGYQN LTL.L1 SKAYQRGY 


SEQ ID NO:32 




T. thermophilic 


676 


w * 3 g 
p I LDETYG 3 FV YOEQqMQi AaavaxY 


SEQ ID NO:225 


SEQ ID NO: 226 


, SEQ ID ND:34 






+L4 TYGI MQIA 


SlfQ ID NO:3S j 


Kccit 1 676 


VLEPTYGI I J< Yf?fiQ WLQ 3 A 


SEQ ID >JO:35 




T. therrnophilus 


8 S3 


a 

GLDGGYFHLTlfd 


fcBQ ID NO:227 


SEQ ID NO:37 


■+GGYF+ 1FD 


SEQ ID NO:39 


EL coJi 


853 


RHKGGYPR-ELFT} 


SEQ IDNO:3S 
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FIGURE 12 



IB) 



AACCTGGTG CG CCTGG CCSAG C CGGGCrrTACCTGQRagaSTgT J^CGa^ &fe 



SynechoCy-12 VSLLDGASQLPALJ. 25 

+slloea+-»-l . [/+ 

Tth a FS1 jLCGAPiKLSX LLJ<WVK 55 

HS+UJGAAK++ + !-> V+ 
M. tuber ID VSHLDGWUUTPNLATWE 36 

Tth 7 5 P AU<MTDHGMFFG^XF^KKATEMCIKPlLGYEAXVJUiEEKPDRKR 212 

PA+ KTDHQH *GA FY AT+ BIKPI+G EA +71 SRF £J* 
■ H. .tflb«r 4] PAVG«TDHGHMFG^KP¥HSA^GIKPnGVEAYTAPGSRE^T^ Bfi 

. * , iV : ....FH *£+ , . GY4Nt.V+L^+ ^ . . . . -> .. , • f . - 

Tth 22a GG Y PHXTX X AKDFTGYQN LVRLA SUA 30* 

G Y H T A+4 TG +WL. +L+S A 
H tuber 103' GSVTHl,T«r«EMArCWl«LFKI*aBA 12 B . 
FIGURE 13



(A) 

a-TV-GCCGRC AAACTCCGCTTCXtCCC ACCTCC AC WG C ACACCCftGTTCT CCCrCCT^GA CGOGGCGC CGAAkC 



SJSrrc^AJ^GACTOCACCG^^^ 



AG GTCCTCA AGGANTTC<jCCCG*^AANrACfiGCCTX3GGGATCGTC»GCCAuu^ 

CWTTTAAGGGG CTCCTCCGCCGCTACCCG6 ACOGGATCiVCCGAGGG GTrCTACCGG^AC^TCrTC CG CCTTTT 
GGAC-AAGCTCATTG 




M -.tuber 1 MSGSSAGS 

Tth ... 1 



M- tuber 54 
Tth 51 
Ryn. sp , 4 *? 



Tth 55 
Gyn - - 



syn.sp. 139 




CLGjiE;i|f'Q 

clggevIpo 
FIGURE 14



<A> 




^GGCA^AAGCCCATCG^G^ 



CTGGACCTGGCOGANGCCCCGCrCAACAAeTACCT' 



(B} 

^GCGGCCTTGGAGGCGAGG^rlNcCCAGGGTGCCGATC^^ 

CGTACI'GGATCACCCfjGrCCCGCTCCCGCTCGGAGAAGTCCGTGTCAATCTCGGGCAl'GGAGACCCTCTOGGG^ 
GTTCAaCAAGCttCTCAAACACMAGG^ 

AG G CTCCCG CCCaCeCTCCCCCTG COGGG tXICCACCG AG ACGCCGTTTCTCCGGG CCC AGTTG AMI ATTCCT 
GG ACG ATG AGC5A AGTAGCC^GGH AAACCCCATG C(S CTCT ATC ACGGG &AAGCTCGTAAAC CCCCCGCTCCA AA/ 



Ttl> 



393 GXPG* PLrVOSVlHMARRMGVSVGPGRGSAAGSWAY AVGITNIW .kPCLLFB^LHP J 2** 
E.cplj 336 G FPGY F I VHE^TQ WS RUN G VPVG VGRG SG ftG£jLV A Y Al-K. I TDLDP] 'EF DLLFER i LWPt 395 

E.COli 306 FVSMEBf^VD^KBKRDQVI EHVADMYGRDAVSQI ITPGTMAAKfttfTRDVGRVLGHPVG.F' 4*> 

TUi 133 ABELAKl'IP 7---l'fch base/ 

. + i +KLIP , " , , 

E Crtji *56 VSlRi.S'Kl-lP 4fi4 B. coll amino acid number 
A



■ rrnAAG <7T«Ttv. ; aa a Ac^rrcc t'CCTGGGTCcacr^crr i ■ itx:GGCGATc; aaCTGGaC0t< x> nc< 
accaccagg* octcca cc>: 3ac3(^cjf 2'ACCGcrccocr; a a cr Lccrc atCtog-i cci .a:i 
GQc:crr rc atGaoci'CGTTGGTC a AAGTn(x:trtiGGAAAC^T>. rnzAA-fCTTCACurrCGGGGA 










FIGURE 16



LCX>CGGGGG>J^A"fGGGGCGCCCCGGGTTCAC 

CTTOGApCX^iU iATATCGACfJCt.CXiCfiAGGGCCACCA CGGCATCTTGCTt;(;C^<ri tCCaGQCC 
J ifirGGTCTCAGTGTC 



K . r. n JL i : 4 0 KM LK PQ M . VD PEA i*L VWC J flDKK K PTl'A K VftDEfMDY r n<lAELV 9 6> 

y * P R L' ft VHf,* ^ UKf V 7 - i i LV 

rs . i 1 1 = : a S i rr./i F/i wr hrflsm X i k i .to 3 Tr>n« J PD WD v t rj>pbjsm [ UVD J LV/t iftfl 
FIGURE 17A

GGATCCC G AGCCTC TGG AGG TGCCC T AG AAGG GTCTCGGCCTCGGCC AGG QIC ATCTCCCC GOCO AGGG C 
GT AG CG<3 T AAAGOOC GTCGGCCGC TTC ATA AAGGGCCAC GGTG G GGGCGGCG AGG GG GCGQTCTTCCTG 
CCTCC AG^C~C GGT AG AGOCGT ACGCCTTCGGCCCTTG GO TC TOCCGCCAAG AGC AO GCGGAGG AGG AA 
GGTAG CG TCC AGC ACC AOCC AGC TC ACOOTCCCGCTCCTCCCTC AGGGCG G AA ACC ACC TCC TC AGGGG G 
GO T AAGG GGCCT lSCCC AAC CGGG CG TG GATGGCCCGGTT AAGG G AGAGG AGG GCCTCC AAOOGG TCCTC 
CTTCCCCGCGGTGAGGCGGGCOTAGGCCTTCACeGAOAGGATGACCGCCTQGGGCCTGCCCCGCCCCTCC 
ArCACGATCACCTCCCCTTCCTCCAGCCGCCTCAGCACCTCGCCGAAGTGGAMCGGGOCTCOGTGGCGCT 
T AAGACGC OCTTGCG ATGC ATC ATTTC ATGC ATTAGTTT AGCCG CCCC GGTCCCTCCCC OC AAGG T A AACT 
daca ACC ATQSOC CGCAAAGTCCGCTTCG CCCACCTCC ACC AGCACACCC AGTTCTCCCTCCTGG A 
CGGGGCGGCGAAGCTTTC^ACrTCCTC AAGTGGGT CAAGG-AGACG ACCCCCGAGGACCCCGCCT 
TGQC CATGACCGAOC ACGCC A AC CTCTTCGGGGCCGTGG ACTTCT ACAAGAAGGCCACCG A AA.TCG 
&clTC AAGCCC ATCCTCC^CT ACGAGCtCCT A CQTflGCGG CGGAAA6CC fiCTT?fi ACCflCAAGCGG 
GG AAA GGGCCT AGACOGGGGCT ACTTTCA CCTC AC CCTCCl CGCC AAGG ACTTC A CGGGGT ACC AG. 
Tat r rar.-r^rct ry^r. a n rrr/;rtrTTA crj v zn Ar=r,^rtTTTTA rctAA A. Aft^C CCGGft TTT i A r Hfl 

CCC AGTTC-ATCC TCC AGG ACCGTCTGG A CCTGGCCG AOGCCCGC CTC AACGAQTaCCTCTCC ATCT 

Tc"AAGGACCr^TTCTreATT-GAGATCCAGAACCAOGG^ 

TCC^AAGG^TTrnc^nGAAAGTACei^-CTi^ 

aGAAGCACrWcCCGCC 

CCCGGGCGCrCCCGCTTCCCCTGCGACGAGTTCT ACGTOAAGACCCCCGAGGAGATGCGGGCCAT 
QTreCCCGAGGAGGAOTX^GG^ 

ACGTGGAGCTGC CC ATCGGGG AC A AC ATCGTCT ACCGC ATCC CCCG CTTCCCCCTCCCCGCCC GTC 
GGAC^GAGGCCCAGTACCTCATGGAGCT CACCTTTAAGGGGCTCCTCCGeCGCTACCCGGACCGGA 

AGGCCCTGGCCGAGGCCTTGGCCCAGGTGGAGCGG GAGGCTTGGGAGAGGCTCATGAAGAPCCTC 
CCCCCCraGCCGGGGTeAAGGAGTCGAOGGCGCAGGCCATTTT CCACra^ 

l^CTCATACACrftr? ATCCCCI'TTCCCCCCTAfTICCTC ATCCTrrACCACTACAITCAA fTCCnq; 
CGGAGAA.ACGGCGTCTCCGTGGGGCCC PGC:AnGGGGAGCGCCGCCGGGAGCCTGGTGGCCTAC£ 
CCGTQGGG ATC ACC AACATTGACGCCCTG CGC-rrOGGCCTC CTCTTT GAGCGCTTCC TG AACCCCG 

agaoggtctcc ATccccG^fcC attcac 

TACGTG^^AAeGCTACOGCGAGGACA ACGTC^ 

C AAGGCCGCC CTCAAGCr ACCTGG CCOQGGTCTACGGCATCCC CC AC A AG A AGGCGOAGG AATTGG 
CCAAG^TCATCCCGGTGCACTTCGGGAA GCCCAAGgCCCTGCAGGAGGCCATCCATOTGGTG^C 
GAG CTT AGGGCSGAG ATOG AGAAGG ACCCC AAGGTG CGGG AGGTCCTCG AGGTCCC C ATGCCCCT 
GGAG^GCCTGAACCGCCACGCCTCCGl^CACCCCGCCGGGGTGCT GATCGCCGCCGAGCCCCrCA 
CGGACCTCCTCCCCCTCATGrGCCACCAGGAAGGG CGGCCCGTCACCCAGTACGAerATGGOGGCG 

GTGGAGGCCTTGGGCCTTTTG 

GTCAAGCGC ATCCTC A AGGCGTCCC A^fiCCGTGG AGCTCC ACTA CGATGCCC TCCCC CTGG ACgA 
GGATGACCGCCAfScTCC^ 

tctaccgWcggocccatgg^^ 

TGAGCTACAG CG AGTTTCCCCACGC CGAQAAGT AC CT AAACC CC ATCCTGCACC AGACCTACGCCA 
TCCCCGTCTACCAGGAGCAGATCAT^ 

GCGCACCTCCTCAGC^GGGCCATGGG GAAGAAGAAGCTGAnGAGATGCAGAAGCACCGTOAGC^g 
TTOTCC AGGGGGCCAAGG * a A r:GGGCGTGCCCOACG AGG AGCCCA ACC GCCTCTTTCAC ATGG C 

tggaggccttccccaactacgg cttcaa caaatctcatx^agcggcctacagcctcc^ctcctagc 

AGACCGCCTACCTGAAGGCrCACTACCCCGTCGAGTTCATGGCCGCCCTCrT^ 
ACOACTCCGATAAGOTraCCGAGTACA^^ 

CCOGACGTCAACCGCTCC^GGGTTTGA f *" | ' TT CTGGTCC:AGGGCCGGCAGATCCTCTTCGGC C TCXCC, 
GGGGTOAAAAACGTGGGCGAGCCGGCG ^^ 

CCTACCGGAGCCTCGGGGAC^TTCCTTAAGCG<^TGGACGAOAA GGTGCTCAACAAGCGGACCCT.G 
G AGTCC CTC ATC AAGGCGGOTGC CCTG ^aCGGCTTCGGGGAAAGGGCGCGQCTC CTCGCCTCCCT 




FIGURE 17A (cont.) 



GG AGGGCCTCCTC AGG TGGGCGGCC G AG A CTCGCGAfiAA^CCCGCTCCGGCATGATGGGCCTCT 
TC AGCG AAGTGO AGG AGCCGCCTTTGGCCG AGGCCGC CCCCC TGGACGaGA TC ACC CCGC TCCGC 
TACGAAAAGfiAGGCC CTGGCOaTTI^ATCTTTCCGGCC ATCCC ATCTTGCGGTaTCC CGOGCTOCGtt 
G AC ACGCC C AC CTOC ACCCTGG AGGaGCTTCCCC AC CTGGCCCGCG AC CTGC CGCCCCGGTC TAG 
GOTCCrc CTCGCCGGC4T<X>T[>GAGGAGGTGGTGCGCAA 

CCCG'CTTCOT CCTCTCCG ACG AG ACttttCrGGCGCTTO Al'jK CGOTGGC CTTCGG C.CGPG CC'TaCGA C 
CAGGTCTCC CCG AGGC TC AAGG AGG AC ACCCCCGTCCTCGTCCTCGCCGAG GTGG AGCOGG A GG A 
GGGGGGCGTCCGGGTGCTGGCCC AGGCCCTTTTCG AC CT A CGAOO AGCTGGAGC AQGTCC CCCGGG 

CTGOACGAGOACGCGGGGACCCTCCCCCTGTACGTCCGGGTCCAGGGCGCCTTCGGCGACGCCCT 
CCTCGCCCTGAGGGAGGTGCGGGTGGGGGAGCAGGCCTTGGCGGCCCTCGAGOCCGAGGGCxTTCC 
GGOCCTACC TCCTGCCTG A CCGGGAGGTCCTCCTOC AGGG CGC.C CaGGCGOGGG AGGCC C A GG AG 
L?CGli ■ ^CC LlIL't Alr UtiLJUJ j U UUUUU L CiALrACAtJUt? J U'tJUA J U ( ^L- 1 UUUU liUU<J UC AAU U W,"U 1 
OO.0CCG A GCGCTTTGG GOTO 00 OAOCAAG GCCC TCOTO CCCT ACCGCQOCCGQCCO ATGG TGG AQTCJ CKtT 
CCTGG AGGCCC TCT ACOCCG CGGGGCTTTCCCOOG TGT ACGTGOGGG AO AACOCCGGCCTC 0 TTOCCOCU 
OCC^CTCTC ACCC7 TCCCG ACCGCGG AGG CC TOCTG Q A AAACCTO GAGCAG GCCCTGG AG C ACG TOO AGO 
GG CGCGTGCTCGTGGCC ACCGGGG ACATCCCCC ACCTG AC GG AGO AGOCGG TCCGCTTCG TTCTG 0 AT AA 
GGCTCCTG AG OC GGCCCTGG TCTAOCCCATTQTttCCCAAGG AGG C&Cl KB G AGGCCCGCTTTCCCC AG ACC 
AAGCGCACCT AC GCCCGCCTCCG GG AGGG G ACCTTC ACCGGCOG C AAOCTTCTCCTTTTGO AC AAG TCOC 
TCTTCCGGG AGGCCCTTOCOCTGGCGCGGC GGOT0 GT GGCCCTGCGC AAGAGGCCTTTGGCCCTOGCCCO 
CCTCGTGGGGTG GG AOGTO TTO CTG A AGCTCCTCCTG<iGCCGCCrCTCOCTC0CCG AG GTGG AGOCCCGG 
GCCCAAAGGATCC 
FIGURE 17B



W-ftKTiWAHLHQXTGFSLLDG^^ 

3, AE ARLW E.T 1.5 3 F KTP.F P J £ I ONHCLfEOKKV MEVL FARK7CUWW ATI* DGH Y V R K5tf . A-UtA W BULLA IQS K 5 TL DD P 
RWKFPC DE Fa V KTPILIIRAM r P D&F-WGDE P Fl?i IT V&I KRtfCI W ELPK3DKMV YRI FRFPLF A PJVTE.VJYLMLL TFKG1LF; 
RYP PR I FY P E V FFJ .UJJC]„P PI K5-OGLAI. AE ALAQVP RIL AW&RI J*K& L PPLAGVKGWT AE A I FH RAL Y ELS V IE PJHO F P 
GY FL I VQE>i I HW AREI ?G V S V- *i Et5 RG£ AAGSLV A YA V G 1 31 ? I BPLRFGLL FE RFLtfP ERV5HP DI DT DFS OP. F, F- DP V JO V V 
RE RY GED ^AQI GTLOSLAS KttLKD VARV ¥ G I FH KKABBL AKLI FV&FGiC PK PLQ EA I Q.VV PELFABM EK KV.R EVLE 
VAMRLEGLN Rtf^ VK AAtfVV 3 aXbPLTISlv Phtt RTOEGRP WTO V DMCft V EALGLLKMD FLGLRTLTFLDE V K?. J V KA SQO 
VELDYTOL P5/PPPKT FALLS RS&TKGV FO LfcSQGOTATtBGLKPRR FEDL I ft I RFG PM&H I FT7 I RPHji-SL E PV5 Y 
SWPH A&K WKP XMET YGI FVYflBQIMQ LASAVAOySJ-GE ADLLRRAMGKKKLRRCRST&3 ASS RGPRKGA? *P RRPT A 
5LTWL EJ*. FAH YGFN S LL3 YQTA* VtfAWY P V EfMAALLStf EPMDS DKV A£ Y I ADftftAHG I EVLPP 17/ UR5GF D F 

Lv<^'5 i i^r Ol^i^V r.W OiftArtG* I L KCrRDKOOFY ft^LOWLKfcLD &KV L-L'-i KKTIr & &L I ^jJ/iLMr *trv. ri - A ax,z,':>L 
L RWAAET P.EKA RSGMMGL FSEVEEP PLAEAAPLDEI TRLRY EKEAL'j I Y VBGH PILRY P^lRETATt TLB ELPHLA Kl-L^ 
P RS RVLLAOMV EEVV RKPTKSG GWhL\REVLSPE1TGALEA VfcFGRA YDQ VSPRLKEDTPVLV LAE V EREEGC VKvLAO AV W 
T sc EELEQV FPALEV LV E AS LLDDRGV AHLKSLXDEH.^.CTLPLy WRVQCftTOE ALLft LftEV RV GE EAL AAL EJ.E: i FRAY LL 
P DftEV LLO'SGO-AiSEijC-E A VP F 
FIGURE 17C
FIGURE 17C (cont.)
FIGURE 17D



M AGCLN DIFEAQKi EWHHHH HH I VP RGSGGGGLQNLGRKLR FAHLHQH TQFS LLDGAAK 
LSDLU; W VKETTPEDP ALAMTDHONLFG A VEF YKK. ATEMGIKp) LGYEAY V AAESRFDRK 
RGKGLDGG YFHLTLL AK.DFTG YQN1 ,VRLASRA YLEGFY EKPRIDRE1LREH AEGL1 ALSGC 
LGAEIPQFILQDRLDLAE ARLNE YLSIFKDRFFIEIQI^fHGLPEOKK WE VLKEFAR fC YC U3 M 
V ATNDGH Y VRKEDARAHE VLL Al 0$ KSTLDDPGR WRFPCDEF Y VKTPEEMRAMFPEEEW 
GDEPFDNT^'EIARMCNVELPIGDKMVYRIPRPPLPARRTKAQYLMELTFKGLLRRYPDRITE 
GFYREVFRLLGKLPPHGDGEAL/ t EALAQVEItEAWERLMKSLP^LAGVKEWL'.AlSA]I-HRAL 
Y ELSV] ER MGFPGYFU VQD YIN W ARRNGVS VGPGRGS A AGSL V A Y A VGITN 1 DPLRFGLLF 
ERPLNPERV&IMPDIOTDFSDRERDRVTgYVRERYGEDlCVAgKjTLGSLASKAALKDVARVY 
GJPHKJCAEELAKUPVQFGKPKPLQEAEQVVPELRAEMEKDPJCVREVLEVAMRLEGLNRHA 
S VH AAG VV] AA EPLTDL VPLMRDQEGRP^'TQ YDMGA VEALGLLKMDFLGLRTLTFLDE V 
KRIVKASQGVELDYDALPLDDPKTFALLSRGETKGVFQLESGGMTATLRGLKPRRFEDLIAT 
LSL YRPGPMEH I PT YIRRHHGLEP VS YS EFPH AEK YLKPI LDET YG1P V YQEQIMQI AS A V AG 
YSLGEADLLRRAMGKKKLRRCRSTGSASSRGPRKGACPRKRPTASLTWLEAFANYGFNKS 
HAAA YSLLS YQTA YVKAH YPVEFM A4LLS VERHDS DK V AEYIRD ARAMGI E VLPPD VNRS 
GFDFL VQGRQILFGL S A VKN VGEAAAEA! LRERERGGP YRS LGDFLKPXDEKVLNiCRTLES 
UKAG Al JDC FGERARLL ASLEGLLRWAA tTREK ARSGW M GLFSEVEEPPLAEA APLDFJTR 
L, RYEKE ALO 1WS OHPIL R VPOL R-ETATCTLEEL 7 HL ARDL P PRS R VL LAOMVE E V VRJCPTK. 

SGGMMARFVLSDETGALEAVAFGRAYIXJVSPRLKEDTPVLVLAEVEREEGGVRVLAQAV 
WT YEELEQ VPRALEVEVEASLLDDRCi V AH LK.S LLDEH AGTLFLY VRVQGAFG EALL A [.RE 
VRVGEEALAALEAEGFRAYLLPDREVLLQGGQAGEAQEAVPF 
FIGURE 18B
FIGURE 18C
FIGURE 18D
FIGURE 19A



ctgatgc:.4Cgccgtgggccactccgtggcc.'\agcgcttcccccacctc;aaga 

r ['lC^GTACGl''[''J'CCACCKtAAAC'nTCAGCAACCiAGCTCATCAACGCC.\TCCGG 
: a AGG ACCCG ATG A COG.* OTTCCGOG A GCGOTACCGCTCCGTG g.\c ctc ctg c 

tggj o g aco acgtcca gttc atcgccggaa agg agcgc accc agg ac gagt r 
tttc c ac acc'ltc aacgccctttacg aggccc a t a agc ag at'c^ tcctctcct 
ccgaccogccgccc/^goacatcctcaccctggaggcgcgcctgcgc;a(;cc<:t 
cn f'tg agt g gg gcttg atcaccg acatcc agcccccgg acctgg a a acccgg . 
atcgcc atc c r( i a ag a ]"g a acc jccg a cjc agco gg gccto ago atccccg ag 
crac«ccctggagtacatcgcccggcaggtcacctccaaca-[^ccgggagcvgg 

AAGGGGCC C" I 'C ATf iCCiGCjCATCGCCTTCGCCTCGCTCAACGGCGTTG ACGTG 
ACCCGCGCCGTGGCCGCC A A SSSSTCTGCt r AC ATcTTCGCCGCCAGGG AGCTG 

.(. i aggcg accc c'h gg a g a' vc atccgc a a a gtggcg acc acttcg gcctg aaa 
ccgg a oa gctc acgggg ag cgg ccg aag aagg a g gtggtcctcccccggc a 
gctcgcc a" i gtacctggtgcgggagctc acccggg cctccctgcccg a gatc 
" ggccagctcttcggcgoccgggaccacaccacggtcctctacgccatccaga 
AG gtccagg agctcgcgg aaagc gaccgggaggtg cagggcctc CTCCC3C A 

CCCTCCGG G A AGGCGT 1 5C ACA'IG A ACCC CTC TGGAT A A CC TC: TC C ATA A CC 
CTGTGCATAACCGGGRCTCA.AAGCTGTGGATAACCGTGTGGAT.VACCTG'['G 

-OA AA actccgggggggccgaaccc ctgtggataa cccgcact ttatccaca 

GG TT ATCC7 A C A G GCCGGGGCC AG TT A TCCACA GCAGTTAATCCACATGCCC 
CTTJ'CTTG TOCTGGAG GGAGTTTIX'.CACCGCC TTA'1'ACACA A7'TG : J C TCCAC A 
GC I ACCTACTACT ACG ACT ACG'I A'L'CTTTT AAA AGTTTT A A A A A t i A C3A GTCGTA 
AT AGCC, ^TT A A AAAAGGC G' ['CGGGCA ATGAAC A ATA ACGOTTCCC AAGA ACT 
C CTCT A C 3G ACC A GCTTTCC CT AC CTGGTGAG ATGT 
FIGURE 19B